Systems and methods for decoding of systematic forward error correction (FEC) codes of selected data in a video bitstream

ABSTRACT

The invention is related to methods and apparatus that advantageously reconstruct and decode video data, such as video object planes (VOPs), using forward error correction (FEC) codes embedded in the video bitstream. Advantageously, the original video data can be recovered even when portions of the video bitstream are corrupted or lost during transmission. Further advantageously, the methods and apparatus disclosed are backward compatible with video bitstreams that are compliant with standard syntax, thereby allowing a decoder to achieve compatibility with both standard video bitstreams and video bitstreams embedded with FEC codes. In one embodiment, a decoder retrieves the FEC codes from a user data video packet. To save bandwidth, an encoder can provide FEC codes corresponding to a subset of the video data, and the decoder can receive and interpret indications as to which data the provided FEC codes correspond.

RELATED APPLICATION

[0001] This application claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Application No. 60/273,443, filed Mar. 5, 2001; U.S. Provisional Application No. 60/275,859, filed Mar. 14, 2001; and U.S. Provisional Application No. 60/286,280, filed Apr. 25, 2001, the entireties of which are hereby incorporated by reference.

APPENDIX A

[0002] Appendix A, which forms a part of this disclosure, is a list of commonly owned copending U.S. patent applications. Each one of the applications listed in Appendix A is hereby incorporated herein in its entirety by reference thereto.

COPYRIGHT RIGHTS

[0003] A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

[0004] 1. Field of the Invention

[0005] The invention is related to video decoding techniques. In particular, the invention relates to systems and methods of recovering corrupted data in video bitstreams by decoding forward error correction (FEC) codes.

[0006] 2. Description of the Related Art

[0007] A variety of digital video compression techniques have arisen to transmit or to store a video signal with a lower bandwidth or with less storage space. Such video compression techniques include international standards, such as H.261, H.263, H.263+, H.263++, H.26L, MPEG-1, MPEG-2, MPEG-4, and MPEG-7. These compression techniques achieve relatively high compression ratios by discrete cosine transform (DCT) techniques and motion compensation (MC) techniques, among others. Such video compression techniques permit video bitstreams to be efficiently carried across a variety of digital networks, such as wireless cellular telephony networks, computer networks, cable networks, via satellite, and the like.

[0008] Unfortunately for users, the various mediums used to carry or transmit digital video signals do not always work perfectly, and the transmitted data can be corrupted or otherwise interrupted. Such corruption can include errors, dropouts, and delays. Corruption occurs with relative frequency in some transmission mediums, such as in wireless channels and in asynchronous transfer mode (ATM) networks. For example, data transmission in a wireless channel can be corrupted by environmental noise, multipath, and shadowing. In another example, data transmission in an ATM network can be corrupted by network congestion and buffer overflow.

[0009] Corruption in a data stream or bitstream that is carrying video can cause disruptions to the displayed video. Even the loss of one bit of data can result in a loss of synchronization with the bitstream, which results in the unavailability of subsequent bits until a synchronization codeword is received. These errors in transmission can cause frames to be missed, blocks within a frame to be missed, and the like. One drawback to a relatively highly compressed data stream is an increased susceptibility to corruption in the transmission of the data stream carrying the video signal.

[0010] Those in the art have sought to develop techniques to mitigate against the corruption of data in the bitstream. For example, error concealment techniques can be used in an attempt to hide errors in missing or corrupted blocks. However, conventional error concealment techniques can be relatively crude and unsophisticated.

[0011] In another example, forward error correction (FEC) techniques are used to recover corrupted bits, and thus reconstruct data in the event of corruption. However, FEC techniques disadvantageously introduce redundant data, which increases the bandwidth of the bitstream for the video or decreases the amount of effective bandwidth remaining for the video. Also, FEC techniques are computationally complex to implement. In addition, conventional FEC techniques are not compatible with the international standards, such as H.261, H.263, MPEG-2, and MPEG-4, but instead, have to be implemented at a higher, “systems” level.

SUMMARY OF THE INVENTION

[0012] The invention is related to methods and apparatus that advantageously reconstruct and decode video data, such as video object planes (VOPs), using forward error correction (FEC) codes embedded in the video bitstream. Advantageously, the original video data can be recovered even when portions of the video bitstream are corrupted or lost during transmission. Further advantageously, the methods and apparatus disclosed are backward compatible with video bitstreams that are compliant with standard syntax, thereby allowing a decoder to achieve compatibility with both standard video bitstreams and video bitstreams embedded with FEC codes. In one embodiment, a decoder retrieves the FEC codes from a user data video packet. To save bandwidth, an encoder can provide FEC codes corresponding to a subset of the video data, and the decoder can receive and interpret indications as to which data the provided FEC codes correspond.

[0013] One embodiment of the invention includes a video decoder adapted to reconstruct corrupted video data comprising: a receiver circuit adapted to receive a video bitstream; a buffer coupled to the receiver circuit, where the buffer is adapted to store at least a portion of the video bitstream; a parsing circuit adapted to distinguish video data from forward error correction (FEC) codes; an error monitoring circuit configured to detect corruption in the video data; and an FEC decoder adapted to receive the video data and the FEC codes, where the FEC decoder is configured to remove the corruption in the video data to which the FEC codes apply.

[0014] One embodiment of the invention includes a video decoder that decodes a video bitstream that includes forward error correction (FEC) codes, the video decoder comprising: means for receiving the video bitstream, which includes both video data and FEC codes; means for retrieving video data from the video bitstream; means for determining if there is corruption in a portion of the video data retrieved; means for retrieving FEC codes from the video bitstream in response to a detection of corruption; and means for using the FEC codes to reconstruct the portion of the video data such that the portion of the video data is recovered without corruption.

[0015] One embodiment of the invention includes a process of decoding a video bitstream that includes forward error correction (FEC) codes, the process comprising: receiving the video bitstream, which includes both video data and FEC codes; retrieving video data from the video bitstream; determining if there is corruption in a portion of the video data retrieved; retrieving FEC codes from the video bitstream in response to a detection of corruption; and using the FEC codes to reconstruct the portion of the video data such that the portion of the video data is recovered without corruption.

[0016] One embodiment of the invention includes a process of decoding a video bitstream that includes forward error correction (FEC) codes, the process comprising: receiving the video bitstream, which includes both video data and FEC codes; retrieving video data from the video bitstream; determining if FEC codes that correspond to the retrieved video data are available; retrieving FEC codes from the video bitstream when the FEC codes are available; and using the FEC codes to decode the portion of the video data such that the portion of the video data is recovered without corruption.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] These and other features of the invention will now be described with reference to the drawings summarized below. These drawings and the associated description are provided to illustrate preferred embodiments of the invention and are not intended to limit the scope of the invention.

[0018] FIG. 1 illustrates a networked system for implementing a video distribution system in accordance with one embodiment of the invention.

[0019] FIG. 2 illustrates a sequence of frames.

[0020] FIG. 3 is a flowchart generally illustrating a process of concealing errors or missing data in a video bitstream.

[0021] FIG. 4 illustrates a process of temporal concealment of missing motion vectors.

[0022] FIG. 5 is a flowchart generally illustrating a process of adaptively concealing errors in a video bitstream.

[0023] FIG. 6 is a flowchart generally illustrating a process that can use weighted predictions to compensate for errors in a video bitstream.

[0024] FIG. 7A illustrates a sample of a video packet with DC and AC components for an I-VOP.

[0025] FIG. 7B illustrates a video packet for a P-VOP.

[0026] FIG. 8 illustrates an example of discarding a corrupted macroblock.

[0027] FIG. 9 is a flowchart that generally illustrates a process according to an embodiment of the invention of partial RVLC decoding of discrete cosine transform (DCT) portions of corrupted packets.

[0028] FIGS. 10-13 illustrate partial RVLC decoding strategies.

[0029] FIG. 14 illustrates a partially corrupted video packet with at least one intra-coded macroblock.

[0030] FIG. 15 illustrates a sequence of macroblocks with AC prediction.

[0031] FIG. 16 illustrates a bit structure for an MPEG-4 data partitioning packet.

[0032] FIG. 17 illustrates one example of a tradeoff between block error rate (BER) correction capability versus overhead.

[0033] FIG. 18 illustrates a video bitstream with systematic FEC data.

[0034] FIG. 19 is a flowchart generally illustrating a process of decoding systematically encoded FEC data in a video bitstream.

[0035] FIG. 20 is a block diagram generally illustrating one process of using a ring buffer in error resilient decoding of video data.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0036] Although this invention will be described in terms of certain preferred embodiments, other embodiments that are apparent to those of ordinary skill in the art, including embodiments that do not provide all of the benefits and features set forth herein, are also within the scope of this invention. Accordingly, the scope of the invention is defined only by reference to the appended claims.

[0037] The display of video can consume a relatively large amount of bandwidth, especially when the video is displayed in real time. Moreover, when the video bitstream is wirelessly transmitted or is transmitted over a congested network, packets may be lost or unacceptably delayed. Even when a packet of data in a video bitstream is received, if the packet is not timely received due to network congestion and the like, the packet may not be usable for decoding of the video bitstream in real time. Embodiments of the invention advantageously compensate for and conceal errors that occur when packets of data in a video bitstream are delayed, dropped, or lost. Some embodiments reconstruct the original data from other data. Other embodiments conceal or hide the result of errors so that a corresponding display of the video bitstream exhibits relatively fewer errors, thereby effectively increasing the signal-to-noise ratio (SNR) of the system. Further advantageously, embodiments of the invention can remain downward compatible with video bitstreams that are compliant with existing video encoding standards.

[0038] FIG. 1 illustrates a networked system for implementing a video distribution system in accordance with one embodiment of the invention. An encoding computer 102 receives a video signal, which is to be encoded to a relatively compact and robust format. The encoding computer 102 can correspond to a variety of machine types, including general purpose computers that execute software and to specialized hardware. The encoding computer 102 can receive a video sequence from a wide variety of sources, such as via a satellite receiver 104, a video camera 106, and a video conferencing terminal 108. The video camera 106 can correspond to a variety of camera types, such as video camera recorders, Web cams, cameras built into wireless devices, and the like. Video sequences can also be stored in a data store 110. The data store 110 can be internal to or external to the encoding computer 102. The data store 110 can include devices such as tapes, hard disks, optical disks, and the like. It will be understood by one of ordinary skill in the art that a data store, such as the data store 110 illustrated in FIG. 1, can store unencoded video, encoded video, or both. In one embodiment, the encoding computer 102 retrieves unencoded video from a data store, such as the data store 110, encodes the unencoded video, and stores the encoded video to a data store, which can be the same data store or another data store. It will be understood that a source for the video can include a source that was originally taken in a film format.

[0039] The encoding computer 102 distributes the encoded video to a receiving device, which decodes the encoded video. The receiving device can correspond to a wide variety of devices that can display video. For example, the receiving devices shown in the illustrated networked system include a cell phone 112, a personal digital assistant (PDA) 114, a laptop computer 116, and a desktop computer 118. The receiving devices can communicate with the encoding computer 102 through a communication network 120, which can correspond to a variety of communication networks including a wireless communication network. It will be understood by one of ordinary skill in the art that a receiving device, such as the cell phone 112, can also be used to transmit a video signal to the encoding computer 102.

[0040] The encoding computer 102, as well as a receiving device or decoder, can correspond to a wide variety of computers. For example, the encoding computer 102 can be any microprocessor or processor (hereinafter referred to as processor) controlled device, including, but not limited to a terminal device, such as a personal computer, a workstation, a server, a client, a mini computer, a main-frame computer, a laptop computer, a network of individual computers, a mobile computer, a palm top computer, a hand held computer, a set top box for a TV, an interactive television, an interactive kiosk, a personal digital assistant (PDA), an interactive wireless communications device, a mobile browser, a Web enabled cell phone, or a combination thereof. The computer may further possess input devices such as a keyboard, a mouse, a trackball, a touch pad, or a touch screen and output devices such as a computer screen, printer, speaker, or other output devices now in existence or later developed.

[0041] The encoding computer 102, as well as a decoder described herein, can correspond to a uniprocessor or multiprocessor machine. Additionally, the computers can include an addressable storage medium or computer accessible medium, such as random access memory (RAM), an electronically erasable programmable read-only memory (EEPROM), hard disks, floppy disks, laser disk players, digital video devices, Compact Disc ROMs, DVD-ROMs, video tapes, audio tapes, magnetic recording tracks, electronic networks, and other techniques to transmit or store electronic content such as, by way of example, programs and data. In one embodiment, the computers are equipped with a network communication device such as a network interface card, a modem, an Infra-Red (IR) port, or other network connection device suitable for connecting to a network. Furthermore, the computers execute an appropriate operating system, such as Linux, Unix, Microsoft® Windows® 3.1, Microsoft® Windows® 95, Microsoft® Windows® 98, Microsoft® Windows® NT, Microsoft® Windows® 2000, Microsoft® Windows® Me, Microsoft® Windows® XP, Apple® MacOS®, IBM® OS/2®, Microsoft® Windows® CE, or Palm OS®. As is conventional, the appropriate operating system may advantageously include a communications protocol implementation, which handles all incoming and outgoing message traffic passed over the network, which can include a wireless network. In other embodiments, while the operating system may differ depending on the type of computer, the operating system may continue to provide the appropriate communications protocols necessary to establish communication links with the network.

[0042] FIG. 2 illustrates a sequence of frames. A video sequence includes multiple video frames taken at intervals. The rate at which the frames are displayed is referred to as the frame rate. In addition to techniques used to compress still video, motion video techniques relate a frame at time k to a frame at time k−1 to further compress the video information into relatively small amounts of data. However, if the frame at time k−1 is not available due to an error, such as a transmission error, conventional video techniques may not be able to properly decode the frame at time k. As will be explained later, embodiments of the invention advantageously decode the video stream in a robust manner such that the frame at time k can be decoded even when the frame at time k−1 is not available.

[0043] The frames in a sequence of frames can correspond to either interlaced frames or to non-interlaced frames, i.e., progressive frames. In an interlaced frame, each frame is made of two separate fields, which are interlaced together to create the frame. No such interlacing is performed in a non-interlaced or progressive frame. While illustrated in the context of non-interlaced or progressive video, the skilled artisan will appreciate that the principles and advantages described herein are applicable to both interlaced video and non-interlaced video. In addition, while certain embodiments of the invention may be described only in the context of MPEG-2 or only in the context of MPEG-4, the principles and advantages described herein are applicable to a broad variety of video standards, including H.261, H.263, MPEG-2, and MPEG-4, as well as video standards yet to be developed. In addition, while certain embodiments of the invention may describe error concealment techniques in the context of, for example, a macroblock, the skilled practitioner will appreciate that the techniques described herein can apply to blocks, macroblocks, video object planes, lines, individual pixels, groups of pixels, and the like.

[0044] The MPEG-4 standard is defined in “Coding of Audio-Visual Objects: Systems,” 14496-1, ISO/IEC JTC1/SC29/WG11 N2501, November 1998, and “Coding of Audio-Visual Objects: Visual,” 14496-2, ISO/IEC JTC1/SC29/WG11 N2502, November 1998, and the MPEG-4 Video Verification Model is defined in ISO/IEC JTC1/SC29/WG11, “MPEG-4 Video Verification Model 17.0,” ISO/IEC JTC1/SC29/WG11 N3515, Beijing, China, July 2000, the contents of which are incorporated herein in their entirety.

[0045] In an MPEG-2 system, a frame is encoded into multiple macroblocks, and each macroblock is encoded into six blocks. The macroblocks include information, such as luminance and color, for composing a frame. In addition, while a frame may be encoded as a still frame, i.e., an intra-coded frame, frames in a sequence of frames can be temporally related to each other, i.e., predictive-coded frames, and the macroblocks can relate a section of one frame at one time to a section of another frame at another time.

[0046] In an MPEG-4 system, a frame in a sequence of frames is further encoded into a number of video objects known as video object planes (VOPs). A frame can be encoded into a single VOP or in multiple VOPs. In one system, such as a wireless system, each frame includes only one VOP so that a VOP is a frame. The VOPs are transmitted to a receiver, where they are decoded by a decoder back into video objects for display. A VOP can correspond to an intra-coded VOP (I-VOP), to a predictive-coded VOP (P-VOP), to a bidirectionally-predictive coded VOP (B-VOP), or to a sprite VOP (S-VOP). An I-VOP is not dependent on information from another frame or picture, i.e., an I-VOP is independently decoded. When a frame consists entirely of I-VOPs, the frame is called an I-frame. Such frames are commonly used in situations such as a scene change. Although the lack of dependence on content from another frame allows an I-VOP to be robustly transmitted and received, an I-VOP disadvantageously consumes a relatively large amount of data or data bandwidth as compared to a P-VOP or B-VOP. To efficiently compress and transmit video, many VOPs in video frames correspond to P-VOPs.

[0047] A P-VOP efficiently encodes a video object by referencing the video object to a past VOP, i.e., to a video object (encoded by a VOP) earlier in time. This past VOP is referred to as a reference VOP. For example, where an object in a frame at time k is related to an object in a frame at time k−1, motion compensation encoded in a P-VOP can be used to encode the video object with less information than with an I-VOP. The reference VOP can be either an I-VOP or a P-VOP.

[0048] A B-VOP uses both a past VOP and a future VOP as reference VOPs. In a real-time video bitstream, a B-VOP should not be used. However, the principles and advantages described herein can also apply to a video bitstream with B-VOPs. An S-VOP is used to display animated objects.

[0049] The encoded VOPs are organized into macroblocks. A macroblock includes sections for storing luminance (brightness) components and sections for storing chrominance (color) components. The macroblocks are transmitted and received via the communication network 120. It will be understood by one of ordinary skill in the art that the communication of the data can further include other communication layers, such as modulation to and demodulation from code division multiple access (CDMA). It will be understood by one of ordinary skill in the art that the video bitstream can also include corresponding audio information, which is also encoded and decoded.

[0050] FIG. 3 is a flowchart 300 generally illustrating a process of concealing errors or missing data in a video bitstream. The errors can correspond to a variety of problems or unavailability including a loss of data, a corruption of data, a header error, a syntax error, a delay in receiving data, and the like. Advantageously, the process of FIG. 3 is relatively unsophisticated to implement and can be executed by relatively slow decoders.

[0051] Upon the detection of an error, the process starts at a first decision block 304. The first decision block 304 determines whether the error relates to intra-coding or predictive-coding. It will be understood by the skilled practitioner that the intra-coding or predictive-coding can refer to frames, to macroblocks, to video object planes (VOPs), and the like. While illustrated in the context of macroblocks, the skilled artisan will appreciate that the principles and advantages described in FIG. 3 also apply to video object planes and the like. The process proceeds from the first decision block 304 to a first state 308 when the error relates to an intra-coded macroblock. When the error relates to a predictive-coded macroblock, the process proceeds from the first decision block 304 to a second decision block 312. It will be understood that the error for a predictive-coded macroblock can arise from a missing macroblock in a present frame at time t, or from an error in a reference frame at time t−1 from which motion is referenced.

[0052] In the first state 308, the process interpolates or spatially conceals the error in the intra-coded macroblock, termed a missing macroblock. In one embodiment, the process conceals the error in the missing macroblock by linearly interpolating data from an upper macroblock that is intended to be displayed “above” the missing macroblock in the image, and from a lower macroblock that is intended to be displayed “below” the missing macroblock in the image. Techniques other than linear interpolation can also be used.

[0053] For example, the process can vertically linearly interpolate using a line denoted l_b copied from the upper macroblock and a line denoted l_t copied from the lower macroblock. In one embodiment, the process uses the lowermost line of the upper macroblock as l_b and the topmost line of the lower macroblock as l_t.

[0054] Depending on the circumstances, the upper macroblock and/or the lower macroblock may also not be available. For example, the upper macroblock and/or the lower macroblock may have an error. In addition, the missing macroblock may be located at the upper boundary of an image or at the lower boundary of the image.

[0055] One embodiment of the invention uses the following rules to conceal errors in the missing macroblock when linear interpolation between the upper macroblock and the lower macroblock is not applicable.

[0056] When the missing macroblock is at the upper boundary of the image, the topmost line of the lower macroblock is used as l_b. If the lower macroblock is also missing, the topmost line of the next-lower macroblock in the image is used as l_b, and so forth, if further lower macroblocks are missing. If all the lower macroblocks are missing, a gray line is used as l_b.

[0057] When the missing macroblock is at the lower boundary of the image or the lower macroblock is missing, l_b, the lowermost line of the upper macroblock, is also used as l_t.

[0058] When the missing macroblock is neither at the upper boundary of the image nor at the lower boundary of the image, and interpolation between the upper macroblock and the lower macroblock is not applicable, one embodiment of the invention replaces the missing macroblock with gray pixels (Y=U=V=128 value).

[0059] According to one decoding standard, MPEG-4, pixels that are associated with a block with an error are stored as a “0,” which corresponds to green pixels in a display. Gray pixels can be closer than green to the colors associated with a missing block, and simulation tests have observed a 0.1 dB improvement over the green pixels with relatively little or no increase in complexity. For example, the gray pixel color can be implemented by a copy instruction. When the spatial concealment is complete, the process ends.
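A minimal sketch of the spatial concealment just described, assuming 16-sample macroblock lines held in NumPy arrays; the function name, array layout, and the exact interpolation weights are illustrative assumptions rather than a prescribed implementation:

```python
import numpy as np

MB_SIZE = 16  # assumed macroblock height/width in luma samples
GRAY = 128    # mid-gray replacement value (Y=U=V=128)

def conceal_intra_mb(upper_mb, lower_mb):
    """Spatially conceal a missing intra-coded macroblock.

    l_b is the lowermost line of the macroblock above the gap and l_t is
    the topmost line of the macroblock below it, following the rules in
    the text.  Either neighbor may be None (missing).
    """
    if upper_mb is not None:
        l_b = upper_mb[-1, :].astype(np.float32)
    elif lower_mb is not None:
        # missing MB at the upper image boundary: reuse the line from below
        l_b = lower_mb[0, :].astype(np.float32)
    else:
        l_b = np.full(MB_SIZE, GRAY, dtype=np.float32)  # all neighbors gone: gray line

    if lower_mb is not None:
        l_t = lower_mb[0, :].astype(np.float32)
    else:
        l_t = l_b  # lower boundary or missing lower MB: reuse l_b for both lines

    # vertically interpolate between l_b (above the gap) and l_t (below it)
    w = np.arange(1, MB_SIZE + 1, dtype=np.float32) / (MB_SIZE + 1)
    concealed = (1.0 - w)[:, None] * l_b[None, :] + w[:, None] * l_t[None, :]
    return concealed.round().astype(np.uint8)
```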

[0060] When the error relates to a predictive-coded macroblock, the second decision block 312 determines whether another motion vector is available to be used for the missing macroblock. For example, the video bitstream may also include another motion vector, such as a redundant motion vector, which can be used instead of a standard motion vector in the missing macroblock. In one embodiment, a redundant motion vector is estimated by doubling the standard motion vector. One embodiment of the redundant motion vector references motion in the present frame at time t to a frame at time t−2. When both the frame at time t−2 and the redundant motion vector are available, the process proceeds from the second decision block 312 to a second state 316, where the process reconstructs the missing macroblock from the redundant motion vector and the frame at time t−2. Otherwise, the process proceeds from the second decision block 312 to a third decision block 320.

[0061] In the third decision block 320, the process determines whether the error is due to a predictive-coded macroblock missing in the present frame, i.e., missing motion vectors. When the motion vectors are missing, the process proceeds from the third decision block 320 to a third state 324. Otherwise, the process proceeds from the third decision block 320 to a fourth decision block 328.

[0062] In the third state 324, the process substitutes the missing motion vectors in the missing macroblock to provide temporal concealment of the error. One embodiment of temporal concealment of missing motion vectors is described in greater detail later in connection with FIG. 4. The process advances from the third state 324 to the fourth decision block 328.

[0063] In the fourth decision block 328, the process determines whether an error is due to a missing reference frame, e.g., the frame at time t−1. If the reference frame is available, the process proceeds from the fourth decision block 328 to a fourth state 332, where the process uses the reference frame and the substitute motion vectors from the third state 324. Otherwise, the process proceeds to a fifth state 336.

[0064] In the fifth state 336, the process uses a frame at time t−k as a reference frame. Where the frame corresponds to the previous-previous frame, k can equal 2. In one embodiment, the process multiplies the motion vectors that were received in the macroblock or substituted in the third state 324 by a factor, such as 2 for linear motion, to conceal the error. The skilled practitioner will appreciate that other appropriate factors may be used depending on the motion characteristics of the video images. The process proceeds to end until the next error is detected.
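The overall FIG. 3 decision flow can be summarized in a short sketch; the attribute names on the macroblock object, the frames mapping, and the two concealment callables are assumed interfaces for illustration only, not part of the described embodiment:

```python
def conceal_macroblock_error(mb, frames, spatial_conceal, temporal_conceal):
    """Sketch of the FIG. 3 flow for one erroneous macroblock.

    `mb` is assumed to expose .intra, .redundant_mv, and .motion_vectors;
    `frames` maps offsets (-1, -2, ...) to previously decoded frames;
    the two conceal callables perform the actual pixel work.
    """
    if mb.intra:
        return spatial_conceal(mb)                       # spatial concealment (state 308)

    if mb.redundant_mv is not None and -2 in frames:     # redundant MV available (state 316)
        return temporal_conceal(mb.redundant_mv, frames[-2])

    mvs = mb.motion_vectors
    if mvs is None:                                      # missing MVs (state 324)
        mvs = [(0, 0)] * 4                               # placeholder; FIG. 4 gives fuller rules

    if -1 in frames:                                     # reference frame present (state 332)
        return temporal_conceal(mvs, frames[-1])

    k = 2                                                # fall back to the frame at t-k (state 336)
    scaled = [(k * vx, k * vy) for (vx, vy) in mvs]      # e.g., double the motion for linear motion
    return temporal_conceal(scaled, frames[-k])
```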

[0065] FIG. 4 illustrates an exemplary process of temporal concealment of missing motion vectors. In one embodiment, a macroblock includes four motion vectors. In the illustrated temporal concealment technique, the missing motion vectors of a missing macroblock 402 are substituted with motion vectors copied from other macroblocks. In another embodiment, which will be described later, the missing motion vectors of the missing macroblock 402 are substituted with motion vectors interpolated from other macroblocks.

[0066] When the missing macroblock 402 is below and above other macroblocks in the image, the process copies motion vectors from an upper macroblock 404, which is above the missing macroblock 402, and copies motion vectors from a lower macroblock 406, which is below the missing macroblock 402.

[0067] The missing macroblock 402 corresponds to a first missing motion vector 410, a second missing motion vector 412, a third missing motion vector 414, and a fourth missing motion vector 416. The upper macroblock 404 includes a first upper motion vector 420, a second upper motion vector 422, a third upper motion vector 424, and a fourth upper motion vector 426. The lower macroblock 406 includes a first lower motion vector 430, a second lower motion vector 432, a third lower motion vector 434, and a fourth lower motion vector 436.

[0068] When both the upper macroblock 404 and the lower macroblock 406 are available and include motion vectors, the illustrated process uses the third upper motion vector 424 as the first missing motion vector 410, the fourth upper motion vector 426 as the second missing motion vector 412, the first lower motion vector 430 as the third missing motion vector 414, and the second lower motion vector 432 as the fourth missing motion vector 416.

[0069] When the missing macroblock 402 is at the upper boundary of the image, the process sets both the first missing motion vector 410 and the second missing motion vector 412 to the zero vector (no motion). The process uses the first lower motion vector 430 as the third missing motion vector 414, and the second lower motion vector 432 as the fourth missing motion vector 416.

[0070] When the lower macroblock 406 is corrupted or otherwise unavailable and/or the missing macroblock 402 is at the lower boundary of the image, the process sets the third missing motion vector 414 equal to the value used for the first missing motion vector 410, and the process sets the fourth missing motion vector 416 equal to the value used for the second missing motion vector 412.

[0071] In one embodiment, the missing motion vectors of the missing macroblock 402 are substituted with motion vectors interpolated from other macroblocks. A variety of techniques for interpolation exist. In one example, the first missing motion vector 410 is substituted with a vector sum of the first upper motion vector 420 and 3 times the third upper motion vector 424, i.e., v1₄₁₀ = v1₄₂₀ + (3)(v3₄₂₄). In another example, the third missing motion vector 414 can be substituted with a vector sum of the third lower motion vector 434 and 3 times the first lower motion vector 430, i.e., v3₄₁₄ = (3)(v1₄₃₀) + v3₄₃₄.
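Under the copy rules described for FIG. 4, the motion-vector substitution could look like the following sketch; the 4-tuple block ordering (top-left, top-right, bottom-left, bottom-right) and the function name are assumptions made for illustration, and the interpolated variant of paragraph [0071] is omitted:

```python
ZERO_MV = (0, 0)

def conceal_missing_mvs(upper_mvs, lower_mvs):
    """Substitute the four missing motion vectors of a macroblock.

    upper_mvs / lower_mvs are 4-tuples of (vx, vy) motion vectors of the
    neighboring macroblocks, or None if that neighbor is unavailable.
    """
    if upper_mvs is not None:
        mv1, mv2 = upper_mvs[2], upper_mvs[3]   # reuse the upper MB's bottom MVs (424, 426)
    else:
        mv1, mv2 = ZERO_MV, ZERO_MV             # upper image boundary: assume no motion

    if lower_mvs is not None:
        mv3, mv4 = lower_mvs[0], lower_mvs[1]   # reuse the lower MB's top MVs (430, 432)
    else:
        mv3, mv4 = mv1, mv2                     # lower boundary / missing lower MB: repeat

    return (mv1, mv2, mv3, mv4)
```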

[0072] FIG. 5 is a flowchart 500 generally illustrating a process of adaptively concealing errors in a video bitstream. Advantageously, the process of FIG. 5 adaptively selects a concealment mode such that the error-concealed or reconstructed images can correspond to relatively less distorted images. Simulation tests predict improvements of up to about 1.5 decibels (dB) in peak signal to noise ratio. The process of FIG. 5 can be used to select an error concealment mode even when data for a present frame is received without an error.

[0073] For example, the process can receive three consecutive frames. A first frame is cleanly received. A second frame is received with a relatively high degree of corruption. Data for a third frame is cleanly received, but reconstruction of a portion of the third frame depends on portions of the second frame, which was received with a relatively high degree of corruption. Under certain conditions, it can be advantageous to conceal portions of the third frame because portions of the third frame depend on portions of a corrupted frame. The process illustrated in FIG. 5 can advantageously identify when error concealment techniques should be invoked even when such error concealment techniques would not be needed by standard video decoders to provide a display of the corresponding image.

[0074] The process starts in a first state 504, where the process receives data from the video bitstream for the present frame, i.e., the frame at time t. A portion of the received data may be missing, due to an error, such as a dropout, corruption, delay, and the like. The process advances from the first state 504 to a first decision block 506.

[0075] In the first decision block 506, the process determines whether the data under analysis corresponds to an intra-coded video object plane (I-VOP) or to a predictive-coded VOP (P-VOP). It will be understood by one of ordinary skill in the art that the process can operate at different levels, such as on macroblocks or frames, and that a VOP can be a frame. The process proceeds from the first decision block 506 to a second decision block 510 when the VOP is an I-VOP. Otherwise, i.e., when the VOP is a P-VOP, the process proceeds to a third decision block 514.

[0076] In the second decision block 510, the process determines whether there is an error in the received data for the I-VOP. The process proceeds from the second decision block 510 to a second state 518 when there is an error. Otherwise, the process proceeds to a third state 522.

[0077] In the second state 518, the process conceals the error with spatial concealment techniques, such as the spatial concealment techniques described earlier in connection with the first state 308 of FIG. 3. The process advances from the second state 518 to a fourth state 526.

[0078] In the fourth state 526, the process sets an error value to an error predicted for the concealment technique used in the second state 518. One embodiment normalizes the error to a range between 0 and 255, where 0 corresponds to no error, and 255 corresponds to a maximum error. For example, where gray pixels replace a pixel in an error concealment mode, the error value can correspond to 255. In one embodiment, the error value is retrieved from a table of pre-calculated error estimates. In spatial interpolation, the pixels adjacent to error-free pixels are typically more faithfully concealed than the pixels that are farther away from the error-free pixels. In one embodiment, an error value is modeled as 97 for pixels adjacent to error-free pixels, while other pixels are modeled with an error value of 215. The error values can be maintained in a memory array on a per-pixel basis, can be maintained for only a selection of pixels, can be maintained for groups of pixels, and so forth.

[0079] In the third state 522, the process has received an error-free I-VOP and clears (to zero) the error value for the corresponding pixels of the VOP. Of course, other values can be arbitrarily selected to indicate an error-free state. The process advances from the third state 522 to a fifth state 530, where the process constructs the VOP from the received data and ends. The process can be reactivated to process the next VOP received.

[0080] Returning to the third decision block 514, the process determines whether the P-VOP includes an error. When there is an error, the process proceeds from the third decision block 514 to a fourth decision block 534. Otherwise, the process proceeds to an optional sixth state 538.

[0081] In the fourth decision block 534, the process determines whether the error values for the corresponding pixels are zero or not. If the error values are zero and there is no error in the data of the present P-VOP, then the process proceeds to the fifth state 530 and constructs the VOP with the received data, as this corresponds to an error-free condition. The process then ends or waits for the next VOP to be processed. If the error values are non-zero, then the process proceeds to a seventh state 542.

[0082] In the seventh state 542, the process projects the estimated error value, i.e., a new error value, that would result if the process uses the received data. For example, if a previous frame contained an error, that error may propagate to the present frame by decoding and using the P-VOP of the present frame. In one embodiment, the estimated error value is about 103 plus an error propagation term, which depends on the previous error value. The error propagation term can also include a “leaky” value, such as 0.93, to reflect a slight loss in error propagation per frame. The process advances from the seventh state 542 to an eighth state 546.

[0083] In the eighth state 546, the process projects the estimated error value that would result if the process used an error resilience technique. The error resilience technique can correspond to a wide variety of techniques, such as an error concealment technique described in connection with FIGS. 3 and 4, the use of additional motion vectors that reference other frames, and the like. Where the additional motion vector references the previous-previous frame, one embodiment uses an error value of 46 plus the propagated error. It will be recognized that a propagated error in a previous frame can be different than a propagated error in a previous-previous frame. In one embodiment, the process projects the estimated error values that would result from a plurality of error resilience techniques. The process advances from the eighth state 546 to a ninth state 550.

[0084] In the ninth state 550, the process selects between using the received data and using an error resilience technique. In one embodiment, the process selects between using the received data and using one of multiple error resilience techniques. The construction, concealment, or reconstruction technique that provides the lowest projected estimated error value is used to construct the corresponding portion of the image. The process advances from the ninth state 550 to a tenth state 554, where the process updates the affected error values according to the selected received data or error resilience technique used to generate the frame, and the process ends. It will be understood that the process can then wait until the next VOP is received, and the process can reactivate to process the next VOP.
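A rough sketch of this selection, using the modeled values quoted above (103, 46, and a 0.93 leak); applying the leak to both candidates and clamping the projection to the 0 to 255 range are illustrative assumptions, not requirements of the described embodiment:

```python
LEAK = 0.93          # per-frame "leaky" decay of propagated error (from the text)
ERR_DECODE = 103.0   # modeled base error when decoding the received P-VOP data
ERR_RESIL = 46.0     # modeled base error for the previous-previous-frame technique

def choose_p_vop_mode(prev_error, prev_prev_error):
    """Project an error value for each option and pick the smallest (FIG. 5).

    prev_error / prev_prev_error are the tracked error values (0..255)
    propagated from the frames at t-1 and t-2.  Returns the chosen mode
    and its projected error, which becomes the updated error value.
    """
    projected = {
        "use_received_data": ERR_DECODE + LEAK * prev_error,
        "error_resilience":  ERR_RESIL + LEAK * prev_prev_error,
    }
    mode = min(projected, key=projected.get)
    return mode, min(projected[mode], 255.0)
```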

[0085] In the optional sixth state 538, the process computes the projected error values with multiple error resilience techniques. The error resilience technique that indicates the lowest projected estimated error value is selected. The process advances from the optional sixth state 538 to an eleventh state 558.

[0086] In the eleventh state 558, the process applies the error resilience technique selected in the optional sixth state 538. Where the process uses only one error resilience technique to conceal errors for P-VOPs, the skilled practitioner will appreciate that the optional sixth state 538 need not be present, and the process can apply the error resilience technique in the eleventh state 558 without a selection process. The process advances from the eleventh state 558 to a twelfth state 562, where the process updates the corresponding error values in accordance with the error resilience technique applied in the eleventh state 558. The process then ends and can be reactivated to process future VOPs.

[0087] FIG. 6 is a flowchart 600 generally illustrating a process that can use weighted predictions to compensate for errors in a video bitstream. One embodiment of the process is relatively less complex to implement than adaptive techniques. The illustrated process receives a frame of data and processes the data one macroblock at a time. It will be understood that when errors in transmission arise, the process may not receive an entire frame of data. Rather, the process can start processing the present frame upon other conditions, such as determining that the timeframe for receiving the frame has expired, or receiving data for the subsequent frame, and the like.

[0088] The process starts in a first decision block 604, where the process determines whether the present frame is a predictive-coded frame (P-frame) or is an intra-coded frame (I-frame). The process proceeds from the first decision block 604 to a second decision block 608 when the present frame corresponds to an I-frame. When the present frame corresponds to a P-frame, the process proceeds from the first decision block 604 to a third decision block 612.

[0089] In the second decision block 608, the process determines whether the macroblock under analysis includes an error. The macroblock under analysis can correspond to the first macroblock of the frame and end with the last macroblock of the frame. However, the order of analysis can vary. The error can correspond to a variety of anomalies, such as missing data, syntax errors, checksum errors, and the like. The process proceeds from the second decision block 608 to a first state 616 when no error is detected in the macroblock. If an error is detected in the macroblock, the process proceeds to a second state 620.

[0090] In the first state 616, the process decodes the macroblock. All macroblocks of an intra-coded frame are intra-coded. An intra-coded macroblock can be decoded without reference to other macroblocks. The process advances from the first state 616 to a third state 624, where the process resets an error variance (EV) value corresponding to a pixel in the macroblock to zero. The error variance relates to a predicted or expected amount of error propagation. Since the intra-coded macroblock does not depend on other macroblocks, an error-free intra-coded macroblock can be expected to have an error variance of zero. It will be understood by one of ordinary skill in the art that any number can be arbitrarily selected to represent zero. It will also be understood that the error variance can be tracked in a broad variety of ways, including on a per pixel basis, on groups of pixels, on selected pixels, per macroblock, and the like. The process advances from the third state 624 to a fourth decision block 628.

[0091] In the fourth decision block 628, the process determines whether it has processed the last macroblock in the frame. The process returns from the fourth decision block 628 to the second decision block 608 when there are further macroblocks in the frame to be processed. When the last macroblock has been processed, the process ends and can be reactivated for the subsequent frame.

[0092] In the second state 620, the process conceals the error with spatial concealment techniques, such as the spatial concealment techniques described earlier in connection with the first state 308 of FIG. 3. In one embodiment, the process fills the pixels of the macroblock with gray, which is encoded as 128. The process advances from the second state 620 to a fourth state 632, where the process sets the macroblock's corresponding error variance, $\sigma_{H}^{2}$, to a predetermined value, $\sigma_{H\Gamma}^{2}$. In one embodiment, the error variance, $\sigma_{H}^{2}$, is normalized to a range between 0 and 255. The predetermined value can be obtained by, for example, simulation results, real world testing, and the like. In addition, the predetermined value can depend on the concealment technique. In one embodiment, where the concealment technique is to fill the macroblock with gray, the predetermined value, $\sigma_{H\Gamma}^{2}$, is 255. The process advances from the fourth state 632 to the fourth decision block 628.
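The per-pixel error-variance bookkeeping for an I-frame might be sketched as follows; the array layout, the 16-pixel macroblock size, and the callable interfaces are illustrative assumptions:

```python
import numpy as np

MB = 16                # assumed macroblock size in pixels
SIGMA2_GRAY = 255.0    # predetermined variance for the gray-fill concealment

def new_error_variance_map(height, width):
    """Create a per-pixel error-variance array for one frame."""
    return np.zeros((height, width), dtype=np.float32)

def process_iframe_macroblock(ev_map, mb_row, mb_col, has_error, decode_mb, conceal_mb):
    """Decode or conceal one I-frame macroblock and update its error variance."""
    y, x = mb_row * MB, mb_col * MB
    if not has_error:
        decode_mb(mb_row, mb_col)                 # intra MB decodes independently
        ev_map[y:y + MB, x:x + MB] = 0.0          # error-free: variance reset to zero
    else:
        conceal_mb(mb_row, mb_col)                # e.g., fill the macroblock with gray (128)
        ev_map[y:y + MB, x:x + MB] = SIGMA2_GRAY  # predetermined concealment variance
```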

[0093] When the frame is a P-frame, the process proceeds from the first decision block 604 to the third decision block 612. In the third decision block 612, the process determines whether the macroblock under analysis includes an error. The process proceeds from the third decision block 612 to a fifth decision block 636 when no error is detected. When an error is detected, the process proceeds from the third decision block 612 to a fifth state 640.

[0094] A macroblock in a P-frame can correspond to either an inter-coded macroblock or to an intra-coded macroblock. In the fifth decision block 636, the process determines whether the macroblock corresponds to an inter-coded macroblock or to an intra-coded macroblock. The process proceeds from the fifth decision block 636 to a sixth state 644 when the macroblock corresponds to an intra-coded macroblock. When the macroblock corresponds to an inter-coded macroblock, the process proceeds to a seventh state 648.

[0095] In the sixth state 644, the process proceeds to decode the intra-coded macroblock that was received without an error. The intra-coded macroblock can be decoded without reference to another macroblock. The process advances from the sixth state 644 to an eighth state 652, where the process resets the corresponding error variances maintained for the macroblock to zero. The process advances from the eighth state 652 to a sixth decision block 664.

[0096] In the sixth decision block 664, the process determines whether it has processed the last macroblock in the frame. The process returns from the sixth decision block 664 to the third decision block 612 when there are further macroblocks in the frame to be processed. When the last macroblock has been processed, the process ends and can be reactivated for the subsequent frame.

[0097] In the seventh state 648, the process reconstructs the pixels of the macroblock even when the macroblock was received without error. Reconstruction in this circumstance can improve image quality because a previous-previous frame may exhibit less corruption than a previous frame. One embodiment of the process selects between a first reconstruction mode and a second reconstruction mode depending on which mode is expected to provide better error concealment. In another embodiment, weighted sums are used to combine the two modes. In one example, the weights used correspond to the inverse of estimated errors so that the process decodes with minimal mean squared error (MMSE).

[0098] In the first reconstruction mode, the process reconstructs the macroblock based on the received motion vector and the corresponding portion in the previous frame. The reconstructed pixel, $\hat{q}_k$, as reconstructed by the first reconstruction mode, is expressed in Equation 1. In Equation 1, $\hat{r}_k$ is a prediction residual.

$\hat{q}_k = \hat{p}_{k-1} + \hat{r}_k$  (Eq. 1)

[0099] In the second reconstruction mode, the process reconstructs the macroblock by doubling the amount of motion specified by the motion vectors of the macroblock, and the process uses a corresponding portion of the previous-previous frame, i.e., the frame at time k−2.

[0100] The error variance of a pixel reconstructed by the first reconstruction mode, $\sigma_{p_{k-1}}^{2}$, is expressed in Equation 2, where k indicates the frame, e.g., k=0 for the present frame. The error variance of a pixel reconstructed by the second reconstruction mode, $\sigma_{p_{k-2}}^{2}$, is expressed in Equation 3.

$\sigma_{p_{k-1}}^{2} = E\left\{ \left( \hat{p}_{k-1} - \tilde{p}_{k-1} \right)^{2} \right\}$  (Eq. 2)

$\sigma_{p_{k-2}}^{2} = E\left\{ \left( \hat{p}_{k-1} - \tilde{p}_{k-2} \right)^{2} \right\} \cong E\left\{ \left( \hat{p}_{k-1} - \hat{p}_{k-2} \right)^{2} \right\} + E\left\{ \left( \hat{p}_{k-2} - \tilde{p}_{k-2} \right)^{2} \right\} = \sigma_{H\Theta}^{2} + \sigma_{P_{k-2}}^{2}$  (Eq. 3)

[0101] In one embodiment, the process selects the second reconstruction mode when $\sigma_{p_{k-1}}^{2} > \sigma_{H\Theta}^{2} + \sigma_{p_{k-2}}^{2}$. In another embodiment, weighted sums are used to combine the reconstruction techniques. In one example, the weights used correspond to the inverse of predicted errors so that the process decodes with minimal mean squared error (MMSE). With weighted sums, the process combines the two predictions to reconstruct the pixel, $q_k$. In one embodiment, the pixel $q_k$ is reconstructed as $\tilde{q}_k$, as expressed in Equation 4.

$\tilde{q}_k = \beta \tilde{p}_{k-1} + (1 - \beta) \tilde{p}_{k-2} + \hat{r}_k$  (Eq. 4)

[0102] In one embodiment, the weighting coefficient, β, is calculated from Equation 5.

$\beta = \dfrac{\sigma_{H\Theta}^{2} + \sigma_{p_{k-2}}^{2}}{\sigma_{H\Theta}^{2} + \sigma_{p_{k-1}}^{2} + \sigma_{p_{k-2}}^{2}}$  (Eq. 5)

[0103] The process advances from the seventh state 648 to a ninth state 656. In the ninth state 656, the process updates the corresponding error variances for the macroblock based on the reconstruction applied in the seventh state 648. The process advances from the ninth state 656 to the sixth decision block 664. In one embodiment, the error variance is calculated from the expression in Equation 6.

$\sigma_{q_{k}}^{2} = \dfrac{\sigma_{p_{k-1}}^{2} \left( \sigma_{H\Theta}^{2} + \sigma_{p_{k-2}}^{2} \right)}{\sigma_{H\Theta}^{2} + \sigma_{p_{k-1}}^{2} + \sigma_{p_{k-2}}^{2}}$  (Eq. 6)
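A per-pixel sketch of the weighted combination of Equations 4 through 6; the function signature and the handling of the degenerate all-zero-variance case are illustrative assumptions:

```python
def weighted_reconstruction(p_km1, p_km2, residual, var_p_km1, var_p_km2, var_h_theta):
    """Combine the two reconstruction modes per Eqs. 4-6 (per-pixel, floats).

    p_km1:       motion-compensated prediction from the previous frame (t-1)
    p_km2:       prediction from the previous-previous frame (t-2) with doubled MVs
    residual:    decoded prediction residual r_k
    var_*:       tracked error variances; var_h_theta models the doubled-MV mismatch
    Returns the reconstructed pixel value and its updated error variance.
    """
    denom = var_h_theta + var_p_km1 + var_p_km2
    if denom == 0.0:
        return p_km1 + residual, 0.0                  # both modes error-free: use t-1 alone
    beta = (var_h_theta + var_p_km2) / denom          # Eq. 5
    q = beta * p_km1 + (1.0 - beta) * p_km2 + residual                    # Eq. 4
    var_q = var_p_km1 * (var_h_theta + var_p_km2) / denom                 # Eq. 6
    return q, var_q
```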

[0104] In the fifth state 640, the process conceals the errors in the macroblock. A variety of concealment techniques can be applied. In one embodiment, the process uses temporal concealment, regardless of whether the macroblock is intra-coded or inter-coded. It will be understood that in other embodiments, the type of coding used in the macroblock can be used as a factor in the selection of a concealment technique.

[0105] One embodiment of the process selects between a first concealment mode based on a previous frame and a second concealment mode based on a previous-previous frame in the fifth state 640. In the first concealment mode, the process generates an inter-coded macroblock for the missing macroblock using the motion vectors extracted from a macroblock that is above the missing macroblock in the image. If the macroblock that is above the missing macroblock has an error, the motion vectors can be set to zero vectors. The corresponding portion of the frame is reconstructed with the generated inter-coded macroblock and the corresponding reference information from the previous frame, i.e., the frame at t−1.

[0106] In the second concealment mode, the process generates an inter-coded macroblock for the missing macroblock by copying and multiplying by 2 the motion vectors extracted from a macroblock that is above the missing macroblock in the image. If the macroblock above the missing macroblock has an error, the motion vectors can be set to zero vectors. The corresponding portion of the frame is reconstructed with the generated inter-coded macroblock and the corresponding reference information from the previous-previous frame, i.e., the frame at t−2.

[0107] The error variance can be modeled as a sum of the associated propagation error and concealment error. In one embodiment, the first concealment mode has a lower concealment error than the second concealment mode, but the second concealment mode has a lower propagation error than the first concealment mode.

[0108] In one embodiment, the process selects between the first concealment mode and the second concealment mode based on which one provides a lower estimated error variance. In another embodiment, weighted sums are used to combine the two modes. In Equation 7, $\sigma_{q_{k}(i)}^{2}$ denotes the error variance of a pixel $q_k$. The value of i is equal to 1 for the first concealment mode based on the previous frame and is equal to 2 for the second concealment mode based on the previous-previous frame.

$\sigma_{q_{k}(i)}^{2} = E\left\{ \left( \hat{q}_{k} - \tilde{c}_{k-i} \right)^{2} \right\} \cong E\left\{ \left( \hat{q}_{k} - \hat{c}_{k-i} \right)^{2} \right\} + E\left\{ \left( \hat{c}_{k-i} - \tilde{c}_{k-i} \right)^{2} \right\} = \sigma_{H\Delta(i)}^{2} + \sigma_{c_{k-i}}^{2}$  (Eq. 7)

[0109] In Equation 7, $\sigma_{H\Delta(i)}^{2}$ corresponds to the error variance for the concealment mode and $\sigma_{c_{k-i}}^{2}$ corresponds to the propagation error variance.

[0110] In another embodiment, the process computes weighted sums to further reduce the error variance of the concealment. For example, $\hat{q}_k$ can be replaced by $\tilde{q}_k$ as shown in Equation 8.

$\tilde{q}_k = \alpha \tilde{c}_{k-1} + (1 - \alpha) \tilde{c}_{k-2}$  (Eq. 8)

[0111] In one embodiment, the weighting coefficient, α, is as expressed in Equation 9.

$\alpha = \dfrac{\sigma_{q_{k}(2)}^{2}}{\sigma_{q_{k}(1)}^{2} + \sigma_{q_{k}(2)}^{2}}$  (Eq. 9)

[0112] The process advances from the fifth state 640 to a tenth state 660. In the tenth state 660, the process updates the corresponding error variances for the macroblock based on the concealment applied in the fifth state 640, and the process advances to the sixth decision block 664. In one embodiment with weighted sums, the error variance is calculated from the expression in Equation 10.

$\sigma_{q_{k}}^{2} = E\left\{ \left( \hat{q}_{k} - \tilde{q}_{k} \right)^{2} \right\} = \dfrac{\sigma_{q_{k}(1)}^{2} \cdot \sigma_{q_{k}(2)}^{2}}{\sigma_{q_{k}(1)}^{2} + \sigma_{q_{k}(2)}^{2}}$  (Eq. 10)
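Similarly, the weighted blend of the two concealment modes in Equations 8 through 10 could be sketched per pixel as follows, with the same caveats about the assumed signature and the degenerate zero-variance case:

```python
def weighted_concealment(c_km1, c_km2, var_q1, var_q2):
    """Blend the two concealment modes per Eqs. 8-10 (per-pixel, floats).

    c_km1 / c_km2: concealed predictions from the previous and the
    previous-previous frame; var_q1 / var_q2: their estimated error
    variances (concealment plus propagation terms, Eq. 7).
    """
    denom = var_q1 + var_q2
    if denom == 0.0:
        return c_km1, 0.0                      # degenerate case: either mode is exact
    alpha = var_q2 / denom                     # Eq. 9: weight favors the lower-variance mode
    q = alpha * c_km1 + (1.0 - alpha) * c_km2  # Eq. 8
    var_q = var_q1 * var_q2 / denom            # Eq. 10
    return q, var_q
```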

[0113] In some situations, an entire frame is dropped or lost. One embodiment of the invention advantageously repeats the previous frame, or interpolates between the previous frame and the next frame, in response to a detection of a frame that is missing from a frame sequence. In a real-time application, the display of the sequence of frames can be slightly delayed to allow the decoder time to receive the next frame, to decode the next frame, and to generate the interpolated replacement frame from the previous frame and the next frame. The missing frame can be detected by calculating a frame rate from received frames and by calculating an expected time to receive a subsequent frame. When a frame does not arrive at the expected time, it is replaced with the previous frame or interpolated from the previous and next frames. One embodiment of the process further resynchronizes the available audio portion to correspond with the displayed images.
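One way to sketch the expected-arrival-time test described above; the timestamp bookkeeping and the half-period tolerance are illustrative assumptions rather than part of the described embodiment:

```python
def classify_frames(frame_times, frame_rate, tolerance=0.5):
    """Flag dropped or late frames from their expected arrival times.

    `frame_times` maps frame indices to arrival timestamps (seconds) of the
    frames actually received; indices that are absent or arrive too late are
    flagged for replacement (repeat of the previous frame, or interpolation
    between the previous and next frames).  `tolerance` is in frame periods.
    """
    period = 1.0 / frame_rate
    first = min(frame_times)
    start = frame_times[first]
    plan = []
    for idx in range(first, max(frame_times) + 1):
        expected = start + (idx - first) * period
        on_time = idx in frame_times and frame_times[idx] <= expected + tolerance * period
        plan.append((idx, "decode" if on_time else "replace"))
    return plan
```

For example, with a 30 frames-per-second sequence in which the frame at index 2 never arrives, classify_frames({0: 0.000, 1: 0.034, 3: 0.101}, 30) marks index 2 for replacement while the other frames are decoded normally.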

[0114] Data corruption is an occasionally unavoidable occurrence. Various techniques can help conceal errors in the transmission or reception of video data. However, standard video decoding techniques can inefficiently declare error-free data as erroneous. For example, the MPEG-4 standard recommends dumping an entire macroblock when an error is detected in the macroblock. The following techniques illustrate that data for some macroblocks can be reliably recovered and used from video packets with corruption. For example, a macroblock in an MPEG-4 system can contain six 8-by-8 image blocks. Four of the image blocks encode luminosity, and two of the image blocks encode chromaticity. In one conventional system, all six of the image blocks are discarded even if a transmission error were only to affect one image block.

[0115] FIGS. 7A and 7B illustrate sample video packets. In an MPEG-4 system, video packets include resynchronization markers to indicate the start of a video packet. The number of macroblocks within a video packet can vary.

[0116] FIG. 7A illustrates a sample of a video packet 700 with DC and AC components for an I-VOP. The video packet 700 includes a video packet header 702, which includes the resynchronization marker and other header information that can be used to decode the macroblocks of the packet, such as the macroblock number of the first macroblock in the packet and the quantization parameter (QP) to decode the packet. A DC portion 704 can include mcbpc, dquant, and dc data, such as luminosity. A DC marker 706 separates the DC portion 704 from an AC portion 708. In one embodiment, the DC marker 706 is a 19-bit binary string “110 1011 0000 0000 0001.” The AC portion 708 can include an ac_pred_flag and other texture information.

[0117] FIG. 7B illustrates a video packet 720 for a P-VOP. The video packet 720 includes a video packet header 722 similar to the video packet header 702 of FIG. 7A. The video packet 720 further includes a motion vector portion 724, which includes motion data. A motion marker 726 separates the motion data in the motion vector portion 724 from texture data in a DCT portion 728. In one embodiment, the motion marker is a 17-bit binary string “1 1111 0000 0000 0001.”

[0118] FIG. 8 illustrates an example of discarding a corrupted macroblock. Reversible variable length codes (RVLC) are designed to allow data, such as texture codes, to be read or decoded in both a forward direction 802 and a reverse or backward direction 804. For example, in the forward direction 802 with N macroblocks, a first macroblock 806, MB #0, is read first and a last macroblock 808, MB #N−1, is read last. An error can be located in a macroblock 810, which can be used to define a range of macroblocks 812 that are discarded.

[0119] FIG. 9 is a flowchart that generally illustrates a process according to an embodiment of the invention of partial RVLC decoding of discrete cosine transform (DCT) portions of corrupted packets. The process starts at a first state 904 by reading macroblock information, such as the macroblock number, of the video packet header of the video packet. The process advances from the first state 904 to a second state 908.

[0120] In the second state 908, the process inspects the DC portion or the motion vector portion of the video packet, as applicable. The process applies syntactic and logic tests to the video packet header and to the DC portion or motion vector portion to detect errors therein. The process advances from the second state 908 to a first decision block 912.

[0121] In the first decision block 912, the exemplary process determines whether there was an error in the video packet header from the first state 904 or in the DC portion or motion vector portion from the second state 908. The first decision block 912 proceeds to a third state 916 when an error is detected. When no error is detected, the process proceeds from the first decision block 912 to a fourth state 920.

[0122] In the third state 916, the process discards the video packet. It will be understood by one of ordinary skill in the art that errors in the video packet header or in the DC portion or motion vector portion can lead to relatively severe errors if incorrectly decoded. In one embodiment, error concealment techniques are instead invoked, and the process ends. The process can be reactivated later to read another video packet.

[0123] In the fourth state 920, the process decodes the video packet in the forward direction. In one embodiment, the process decodes the video packet according to standard MPEG-4 RVLC decoding techniques. One embodiment of the process maintains a count of macroblocks in a macroblocks counter. The header at the beginning of the video packet includes a macroblock index, which can be used to initialize the macroblocks counter. As decoding proceeds in the forward direction, the macroblocks counter increments. When an error is encountered, one embodiment removes one count from the macroblocks counter such that the macroblocks counter contains the number of completely decoded macroblocks.

[0124] In addition, one embodiment of the process stores all codewords as leaves of a binary tree. Branches of the binary tree are labeled with either a 0 or a 1. One embodiment of the process uses two different tree formats depending on whether the macroblock is intra or inter coded. When decoding in the forward direction, bits from the video packet are retrieved from a bit buffer containing the RVLC data, and the process traverses the tree until one of three events is encountered. These events correspond to a first event where a valid codeword is reached at a leaf-node; a second event where an invalid leaf of the binary tree (not corresponding to any RVLC codeword) is reached; and a third event where the end of the bit buffer is reached.
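
By way of illustration only, the tree traversal and the three events can be sketched as follows. The toy tree and all names are hypothetical; an actual decoder would use the normative MPEG-4 RVLC tables for intra and inter macroblocks.

    # Minimal sketch of forward RVLC decoding of one codeword.
    # Leaves of the tree are (last, run, level) tuples; an invalid leaf is None.
    VALID, INVALID_LEAF, END_OF_BUFFER = 0, 1, 2

    def decode_one_codeword(bits, pos, tree):
        """Walk the binary tree bit by bit; return (event, symbol, new_pos)."""
        node = tree
        while isinstance(node, dict):
            if pos >= len(bits):
                return END_OF_BUFFER, None, pos   # third event
            node = node[bits[pos]]
            pos += 1
        if node is None:
            return INVALID_LEAF, None, pos        # second event
        return VALID, node, pos                   # first event

    # Toy tree: "0" -> (0, 0, 1), "10" -> (1, 0, 1), "11" -> invalid leaf.
    toy_tree = {0: (0, 0, 1), 1: {0: (1, 0, 1), 1: None}}
    event, symbol, new_pos = decode_one_codeword([1, 0, 0], 0, toy_tree)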

[0125] The first event indicates no error. With no error, a valid RVLC codeword is mapped, such as via a simple lookup table, to its corresponding leaf-node (last, run, level). In one embodiment, this information is stored in an array. When an entire 8-by-8 block is decoded, as indicated by the presence of an RVLC codeword with last=1, the process proceeds to decode the next block until an error is encountered or the last block is reached.

[0126] The second event and the third event correspond to errors. These errors can be caused by a variety of error conditions. Examples of error conditions include an invalid RVLC codeword, such as wrong marker bits in the expected locations of ESCAPE symbols; a codeword decoded from an ESCAPE symbol that results in (last, run, level) information that should have been encoded by a regular (non-ESCAPE) symbol; more than 64 (or 63 for the case of intra blocks with DC coded separately from AC) DCT coefficients in an 8-by-8 block; extra bits remaining after successfully decoding all expected DCT coefficients of all 8-by-8 blocks in a video packet; and insufficient bits to decode all expected 8-by-8 blocks in the video packet. These conditions can be tested sequentially. For example, when testing for extra bits remaining, the condition is tested after all the 8-by-8 blocks in the video packet are processed. In another example, the testing of the number of DCT coefficients can be performed on a block-by-block basis. The process advances from the fourth state 920 to a second decision block 924. However, it will be understood by the skilled practitioner that the fourth state 920 and the second decision block 924 can be included in a loop, such as a FOR loop.
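
By way of illustration only, two of the consistency tests listed above can be sketched as follows; the symbol representation and all names are hypothetical.

    # Minimal sketch of the coefficient-budget and bit-count checks.
    def coefficient_limit(is_intra, dc_coded_separately):
        """64 coefficients per 8-by-8 block, or 63 for an intra block whose
        DC coefficient is coded separately from the AC coefficients."""
        return 63 if (is_intra and dc_coded_separately) else 64

    def block_within_budget(symbols, is_intra, dc_coded_separately):
        """symbols is a list of decoded (last, run, level) tuples; each symbol
        accounts for `run` zero coefficients plus one nonzero coefficient."""
        total = sum(run + 1 for (_last, run, _level) in symbols)
        return total <= coefficient_limit(is_intra, dc_coded_separately)

    def packet_bit_check(bits_consumed, bits_available, all_blocks_decoded):
        """Packet-level tests: extra bits remaining, or insufficient bits."""
        if all_blocks_decoded and bits_consumed < bits_available:
            return "extra bits remaining"
        if not all_blocks_decoded and bits_consumed >= bits_available:
            return "insufficient bits"
        return "ok"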

[0127] In the second decision block 924, the process determines whether there has been an error in the forward decoding of the video packet as described in the fourth state 920 (in the forward direction). The process proceeds from the second decision block 924 to a fifth state 928 when there is no error. If there is an error in the forward decoding, the process proceeds from the second decision block 924 to a sixth state 932 and to a tenth state 948. Upon an error in forward decoding, the process terminates further forward decoding and records the error location and type of error in the tenth state 948. The error location in the forward direction, L₁, and the number of completely decoded macroblocks in the forward direction, N₁, will be described in greater detail later in connection with FIGS. 10-13.

[0128] In the fifth state 928, the process reconstructs the DCT coefficient blocks and ends. In one embodiment, the reconstruction proceeds according to standard MPEG-4 techniques. It will be understood by one of ordinary skill in the art that the process can be reactivated to process the next video packet.

[0129] In the sixth state 932, the process loads the video packet data to a bit buffer. In order to perform partial RVLC decoding, detection of the DC (for I-VOP) or Motion (for P-VOP) markers for each video packet should be obtained without prior syntax errors or data overrun. In one embodiment, a circular buffer that reads data for the entire packet is used to obtain the remaining bits for a video packet by unpacking each byte to 8 bits.

[0130] The process removes stuffing bits from the end of the buffer, which leaves only data bits in the RVLC buffer. During parsing of the video packet header and motion vector portion or DC portion of the video packet, the expected number of macroblocks, the type of each macroblock (INTRA or INTER), whether a macroblock is skipped or not, how many and which of the expected 4 luminance and 2 chrominance 8-by-8 blocks have been coded and should thus be present in the bitstream, and whether INTRA blocks have 63 or 64 coefficients (i.e., whether their DC coefficient is coded together with or separately from the AC coefficients) should be known. This information can be stored in a data structure with the RVLC data bits. The process advances from the sixth state 932 to a seventh state 936.

[0131] In the seventh state 936, the process performs reversible variable length code (RVLC) decoding in the backward direction on the video packet. In one embodiment, the process performs the backward decoding on the video packet according to standard MPEG-4 RVLC decoding techniques. The maximum number of decoded codewords should be recovered in each direction. One embodiment of the process maintains the number of completely decoded macroblocks encountered in the reverse direction in a counter. In one embodiment, the counter is initialized with a value from the video packet header that relates to the number of macroblocks expected in the video packet, N, and the counter counts down as macroblocks are read. The process advances from the seventh state 936 to an eighth state 940.

[0132] In the eighth state 940, the process detects an error in the video packet from the backward decoding and records the error and the type of error. In addition to the errors for the forward direction described earlier in connection with the fourth state 920, another error that can occur in the reverse decoding direction occurs when the last decoded codeword, i.e., the first codeword in the reverse direction, decodes to a codeword with last=0. Advantageously, detection of the location of the error in the reverse direction can reveal ranges of data where such data is still usable. Use of the error location in the reverse or backward direction, L₂, and use of the number of completely decoded macroblocks in the reverse direction, N₂, will be described later in connection with FIGS. 10-13.

[0133] In the exemplary process, different decoding trees (INTRA/INTER) are used in the reverse decoding direction than in the forward decoding direction. In one embodiment, the reverse decoding trees are obtained by reversing the order of bits for each codeword. In addition, one embodiment modifies the symbol decoding routine to take into account that a sign bit that comes last in forward decoding is encountered first in backward decoding, and that last=1 indicates the last codeword of an 8-by-8 block in forward decoding, but indicates the first codeword in reverse decoding. When decoding in the reverse direction, the very first codeword should have last=1, or otherwise an error is declared.
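
By way of illustration only, the reverse-direction table can be derived from the forward table by reversing the bit order of each codeword, as described above; the toy codewords and names are hypothetical.

    # Minimal sketch: build the reverse decoding table from the forward table.
    def build_reverse_table(forward_table):
        """forward_table maps a codeword bit string, e.g. "110", to a
        (last, run, level) symbol; the reverse table maps the bit-reversed
        codeword to the same symbol."""
        return {code[::-1]: symbol for code, symbol in forward_table.items()}

    forward = {"0": (0, 0, 1), "110": (1, 0, 1)}
    reverse = build_reverse_table(forward)   # {"0": (0, 0, 1), "011": (1, 0, 1)}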

[0134] When data is read in the reverse order, the process looks ahead by one symbol when decoding a block. If a codeword with last=1 is reached, the process has reached the end of reverse decoding of the current 8-by-8 block, and the process advances to the next block. In addition, the order of the blocks is reversed for the same reason. For example, if 5 INTER blocks followed by 3 INTRA blocks are expected in the forward direction, 3 INTRA blocks followed by 5 INTER blocks should be expected in the reverse direction. The process advances from the eighth state 940 to a ninth state 944.

[0135] In the ninth state 944, the process discards overlapping error regions from the forward and the reverse decoding directions. The 2 arrays of decoded symbols are compared to evaluate overlap between the error obtained during forward RVLC decoding and the error obtained during reverse RVLC decoding to partially decode the video packet. Further details of partial decoding will be described in greater detail later in connection with FIGS. 10-13. It will be understood by one of ordinary skill in the art that in the process described herein, the arrays contain the successfully decoded codewords before any decoding error has been declared in each direction. If there is no overlap between successfully decoded regions in the forward and reverse directions at the bit level and also at the DCT (macroblock) level, then one embodiment performs a conservative backtracking of a predetermined number of bits, T, such as about 90 bits in each direction, i.e., the last 90 bits in each direction are discarded. Those codewords that overlap (in the bit buffer) or decode to DCT coefficients that overlap (in the DCT buffer) are discarded. In addition, one embodiment retains only entire INTER macroblocks (no partial macroblock DCT data or intra-coded macroblocks) in the decoding buffers. The remaining codewords are then used to reconstruct the 8-by-8 DCT values for individual blocks, and the process ends. It will be understood that the process can be reactivated to process the next video packet.

[0136] The process illustrated in FIG. 9 reveals the location of the error (the bit location) in the forward direction, L₁; the location of the error in the reverse direction, L₂; the type of error that was encountered in the forward direction and in the reverse direction; the expected length of the video packet, L; the number of expected macroblocks in the video packet, N; the number of completely decoded macroblocks in the forward direction, N₁; and the number of completely decoded macroblocks in the reverse direction, N₂.

[0137] FIGS. 10-13 illustrate partial RVLC decoding strategies. In one exemplary partial RVLC decoding process, a partial decoding strategy for extraction of useful data from a video packet is selected according to one of four outcomes. Processing of a first outcome, where L₁+L₂<L and N₁+N₂<N, will be described later in connection with FIG. 10. Processing of a second outcome, where L₁+L₂<L and N₁+N₂>=N, will be described later in connection with FIG. 11. Processing of a third outcome, where L₁+L₂>=L and N₁+N₂<N, will be described later in connection with FIG. 12. Processing of a fourth outcome, where L₁+L₂>=L and N₁+N₂>=N, will be described later in connection with FIG. 13.
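
By way of illustration only, the selection among the four outcomes can be sketched as follows from the quantities produced by the process of FIG. 9; the function name is hypothetical.

    # Minimal sketch of dispatching to the strategies of FIGS. 10-13.
    def select_strategy(L1, L2, L, N1, N2, N):
        bit_overlap = (L1 + L2) >= L   # forward/reverse bit ranges meet or cross
        mb_overlap = (N1 + N2) >= N    # forward/reverse macroblock counts meet or cross
        if not bit_overlap and not mb_overlap:
            return "FIG. 10"
        if not bit_overlap and mb_overlap:
            return "FIG. 11"
        if bit_overlap and not mb_overlap:
            return "FIG. 12"
        return "FIG. 13"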

[0138] FIG. 10 illustrates a partial decoding strategy used when L₁+L₂<L and N₁+N₂<N. A first portion 1002 of FIG. 10 indicates the bit error positions, L₁ and L₂. A second portion 1004 indicates the completely decoded macroblocks in the forward direction, N₁, and in the reverse direction, N₂. A third portion 1006 indicates a backtracking of bits, T, from the bit error locations. It will be understood by one of ordinary skill in the art that the number selected for the backtracking of bits, T, can vary in a very broad range and can even be different in the forward direction and in the reverse direction. In one embodiment, the value of T is 90 bits.

[0139] The exemplary process apportions the video packet into a first partial packet 1010, a second partial packet 1012, and a discarded partial packet 1014. The first partial packet 1010 may be used by the decoder and includes complete macroblocks up to a bit position corresponding to L₁−T. The second partial packet 1012 may also be used by the decoder and includes complete macroblocks from a bit position corresponding to L−L₂+T to the end of the packet, L, such that the second partial packet is about L₂−T in size. As described in greater detail later in connection with FIG. 14, one embodiment of the process discards intra blocks in the first partial packet 1010 and in the second partial packet 1012, even if the intra blocks are identified as uncorrupted. The discarded partial packet 1014, which includes the remaining portion of the video packet, is discarded.
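
By way of illustration only, the usable bit ranges for the FIG. 10 case can be sketched as follows, with a conservative backtracking of T bits from each error location; all names and the example values are hypothetical.

    # Minimal sketch of the FIG. 10 apportionment (L1 + L2 < L, N1 + N2 < N).
    def fig10_ranges(L1, L2, L, T=90):
        first = (0, max(0, L1 - T))        # first partial packet, up to bit L1 - T
        second = (min(L, L - L2 + T), L)   # second partial packet, from bit L - L2 + T
        return first, second               # bits between the two ranges are discarded

    first, second = fig10_ranges(L1=400, L2=300, L=1000)
    # first  -> (0, 310); second -> (790, 1000)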

[0140] FIG. 11 illustrates a partial decoding strategy used when L₁+L₂<L and N₁+N₂>=N. A first portion 1102 of FIG. 11 indicates the bit error positions, L₁ and L₂. A second portion 1104 indicates the completely decoded macroblocks in the forward direction, N₁, and in the reverse direction, N₂.

[0141] The exemplary process apportions the video packet into a first partial packet 1110, a second partial packet 1112, and a discarded partial packet 1114. The first partial packet 1110 may be used by the decoder and includes complete macroblocks from the start of the video packet to the macroblock corresponding to N−N₂−1. The second partial packet 1112 may also be used by the decoder and includes the (N₁+1)th macroblock to the last macroblock in the video packet, such that the second partial packet 1112 is about N−N₁−1 in size. One embodiment of the process discards intra blocks in the first partial packet 1110 and in the second partial packet 1112, even if the intra blocks are identified as uncorrupted. The discarded partial packet 1114, which includes the remaining portion of the video packet, is discarded.

[0142] FIG. 12 illustrates a partial decoding strategy used when L₁+L₂>=L and N₁+N₂<N. A first portion 1202 of FIG. 12 indicates the bit error positions, L₁ and L₂. A second portion 1204 indicates the completely decoded macroblocks in the forward direction, N₁, and in the reverse direction, N₂.

[0143] The exemplary process apportions the video packet into a first partial packet 1210, a second partial packet 1212, and a discarded partial packet 1214. The first partial packet 1210 may be used by the decoder and includes complete macroblocks from the beginning of the video packet to a macroblock at N−b_mb(L₂), where b_mb(L₂) denotes the macroblock at the bit position L₂. The second partial packet 1212 may also be used by the decoder and includes the complete macroblocks from the bit position corresponding to L₁ to the end of the packet. One embodiment of the process discards intra blocks in the first partial packet 1210 and in the second partial packet 1212, even if the intra blocks are identified as uncorrupted. The discarded partial packet 1214, which includes the remaining portion of the video packet, is discarded.

[0144] FIG. 13 illustrates a partial decoding strategy used when L₁+L₂>=L and N₁+N₂>=N. A first portion 1302 of FIG. 13 indicates the bit error positions, L₁ and L₂. A second portion 1304 indicates the completely decoded macroblocks in the forward direction, N₁, and in the reverse direction, N₂.

[0145] The exemplary process apportions the video packet into a first partial packet 1310, a second partial packet 1312, and a discarded partial packet 1314. The first partial packet 1310 may be used by the decoder and includes complete macroblocks up to the bit position corresponding to the lesser of N−b_mb(L₂), where b_mb(L₂) denotes the last complete macroblock up to bit position L₂, and the complete macroblocks up to the (N−N₂−1)th macroblock. The second partial packet 1312 may also be used by the decoder and includes the number of complete macroblocks counting from the end of the video packet corresponding to the lesser of N−f_mb(L₁), where f_mb(L₁) denotes the last macroblock in the reverse direction that is uncorrupted as determined by the forward direction, and the number of complete macroblocks corresponding to N−N₁−1. One embodiment of the process discards intra blocks in the first partial packet 1310 and in the second partial packet 1312, even if the intra blocks are identified as uncorrupted. The discarded partial packet 1314, which includes the remaining portion of the video packet, is discarded.

[0146] FIG. 14 illustrates a partially corrupted video packet 1402 with at least one intra-coded macroblock. In one embodiment, an intra-coded macroblock in a portion of a partially corrupted video packet is discarded even if the intra-coded macroblock is in a portion of the partially corrupted video packet that is considered uncorrupted.

[0147] A decoding process, such as the process described in connection with FIGS. 9 to 13, allocates the partially corrupted video packet 1402 to a first partial packet 1404, a corrupted partial packet 1406, and a second partial packet 1408. The first partial packet 1404 and the second partial packet 1408 are considered error-free and can be used. The corrupted partial packet 1406 includes corrupted data and should not be used.

[0148] However, the illustrated first partial packet 1404 includes a first intra-coded macroblock 1410, and the illustrated second partial packet 1408 includes a second intra-coded macroblock 1412. One process according to an embodiment of the invention also discards an intra-coded macroblock, such as the first intra-coded macroblock 1410 or the second intra-coded macroblock 1412, when any error or corruption is detected in the video packet, and the process advantageously continues to use the recovered macroblocks corresponding to error-free macroblocks. The process instead conceals the discarded intra-coded macroblocks of the partially corrupted video packets.

[0149] One embodiment of the invention partially decodes intra-coded macroblocks from partially corrupted packets. According to the MPEG-4 standard, any data from a corrupted video packet is dropped. Intra-coded macroblocks can be encoded in both I-VOPs and in P-VOPs. As provided in the MPEG-4 standard, a DC coefficient of an intra-coded macroblock and/or the top-row and first-column AC coefficients of the intra-coded macroblock can be predictively coded from the intra-coded macroblock's neighboring intra-coded macroblocks.

[0150] Parameters encoded in the video bitstream can indicate the appropriate mode of operation. A first parameter, referred to in MPEG-4 as “intra_dc_vlc_thr,” is located in the VOP header. As set forth in MPEG-4, the first parameter, intra_dc_vlc_thr, is encoded to one of 8 codes as described in Table I, where QP indicates a quantization parameter.

TABLE I
Index   Meaning                                        Code
0       Use Intra DC VLC for entire VOP                000
1       Switch to Intra AC VLC at running QP >= 13     001
2       Switch to Intra AC VLC at running QP >= 15     010
3       Switch to Intra AC VLC at running QP >= 17     011
4       Switch to Intra AC VLC at running QP >= 19     100
5       Switch to Intra AC VLC at running QP >= 21     101
6       Switch to Intra AC VLC at running QP >= 23     110
7       Use Intra AC VLC for entire VOP                111
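
By way of illustration only, Table I can be restated as a lookup that answers whether the Intra DC VLC is used for the current macroblock; the function name is hypothetical, while the codes and thresholds follow the table above.

    # Minimal sketch of the intra_dc_vlc_thr decision of Table I.
    INTRA_DC_VLC_THR = {
        "000": None,   # use Intra DC VLC for the entire VOP
        "001": 13, "010": 15, "011": 17,
        "100": 19, "101": 21, "110": 23,
        "111": 0,      # use Intra AC VLC for the entire VOP
    }

    def use_intra_dc_vlc(code, running_qp):
        """Return True when the DC coefficient is coded with the Intra DC VLC."""
        threshold = INTRA_DC_VLC_THR[code]
        if threshold is None:
            return True
        return running_qp < threshold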

[0151] The intra_dc_vlc_thr code of “000” corresponds to separating DC coefficients from AC coefficients in intra-coded macroblocks. With respect to an I-VOP, the setting of the intra_dc_vlc_thr parameter to “000” results in the placement by the encoder of the DC coefficient before the DC marker, and the placement of the AC coefficients after the DC marker.

[0152] With respect to a P-VOP, the setting of the intra_dc_vlc_thr parameter to “000” results in the encoder placing the DC coefficients immediately after the motion marker, together with the cbpy and ac_pred_flag information. It will be understood that the value of the intra_dc_vlc_thr parameter is selected at the encoding level. For error resilience, video bitstreams may be relatively more robustly encoded with the intra_dc_vlc_thr parameter set to “000.” Nonetheless, one embodiment of the invention advantageously detects the setting of the intra_dc_vlc_thr parameter to “000,” and monitors for the motion marker and/or the DC marker. If the corresponding motion marker and/or DC marker is observed without an error, the process classifies the DC information received ahead of the motion marker and/or DC marker as usable and uses the DC information in decoding. Otherwise, the DC information is dropped.

[0153] A second parameter, referred to in MPEG-4 as “ac_pred_flag,” is located after the motion marker/DC marker, but before the RVLC texture data. The “ac_pred_flag” parameter instructs the encoder to differentially encode and the decoder to differentially decode the top row and first column of DCT coefficients (a total of 14 coefficients) from a neighboring block that has the best match with the current block with regard to DC coefficients. The neighboring block with the smallest difference is used as a prediction block as shown in FIG. 15.

[0154] FIG. 15 illustrates a sequence of macroblocks with AC prediction. FIG. 15 includes a first macroblock 1502, A, a second macroblock 1504, B, a third macroblock 1506, C, a fourth macroblock 1508, D, a fifth macroblock 1510, X, and a sixth macroblock 1512, Y. The fifth macroblock 1510, X, and the sixth macroblock 1512, Y, are encoded with AC prediction enabled. A first column of DCT coefficients from the first macroblock 1502, A, is used in the fifth macroblock 1510, X, and the sixth macroblock 1512, Y. The top row of coefficients from the third macroblock 1506, C, or from the fourth macroblock 1508, D, is used to encode the top row of the fifth macroblock 1510, X, or the sixth macroblock 1512, Y, respectively.

[0155] It will be understood that for error resilience, the encoder should disable the AC prediction or differential encoding for intra-coded macroblocks. With the AC prediction disabled, intra-coded macroblocks that correspond to either the first or second “good” part of the RVLC data can be used.

[0156] In one embodiment, with AC prediction enabled, the intra-coded macroblocks of the “good” part of the RVLC data can be dropped as described earlier in connection with FIG. 14.

[0157] In addition, one decoder or decoding process according to an embodiment of the invention further determines whether the intra-coded macroblock, referred to as a “suspect intra-coded macroblock,” can be used even with AC prediction enabled. The decoder determines whether another intra-coded macroblock exists to the immediate left or immediately above the suspect intra-coded macroblock. When no such other intra-coded macroblock exists, the suspect intra-coded macroblock is labeled “good,” and is decoded and used.

[0158] One decoder further determines whether any of the other macroblocks to the immediate left or immediately above the suspect intra-coded macroblock have not been decoded. If there are any such macroblocks, the suspect intra-coded macroblock is not used.
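
By way of illustration only, the neighbor tests of the two preceding paragraphs can be sketched as follows; the macroblock representation and all names are hypothetical.

    # Minimal sketch: can a "suspect" intra-coded macroblock be used when
    # AC prediction is enabled?
    from collections import namedtuple

    MB = namedtuple("MB", "is_intra decoded")

    def suspect_intra_usable(left_mb, above_mb):
        """left_mb / above_mb are None outside the VOP boundary."""
        for neighbor in (left_mb, above_mb):
            if neighbor is None:
                continue
            if neighbor.is_intra:
                return False     # an intra-coded neighbor exists
            if not neighbor.decoded:
                return False     # a neighbor has not been decoded
        return True              # labeled "good": decode and use

    # Example: an inter-coded, decoded left neighbor and no macroblock above.
    usable = suspect_intra_usable(MB(is_intra=False, decoded=True), None)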

[0159] FIG. 16 illustrates a bit structure for an MPEG-4 data partitioning packet. Data partitioning is an option that can be selected by the encoder. The data partitioning packet includes a resync marker 1602, a macroblock_number 1604, a quant_scale 1606, a header extension code (HEC) 1608, motion and header information 1610, a motion marker 1612, texture information 1614, and a resync marker 1616.

[0160] The MPEG-4 standard allows the DC portion of frame data to be placed in the data partitioning packet either before or after the AC portion of frame data. The order is determined by the encoder. When data partitioning is enabled, the encoder includes motion vectors together with “not-coded” and “mcbpc” information in the motion and header information 1610 ahead of the motion marker 1612 as part of header information, as shown in FIG. 16.

[0161] When an error is detected in the receiving of a packet, but the error occurs after the motion marker 1612, one embodiment of the invention uses the data received ahead of the motion marker 1612. One embodiment predicts a location for the motion marker 1612 and detects an error based on whether or not the motion marker 1612 was observed in the predicted location. Depending on the nature of the scenes encoded, the data included in the motion and header information 1610 can yield a wealth of information that can be advantageously recovered.

[0162] For example, when the “not coded” flag is set, a macroblock should be copied from the same location in the previous frame by the decoder. The macroblocks corresponding to the “not coded” flag can be reconstructed safely. The “mcbpc” identifies which of the 6 8-by-8 blocks that form a macroblock (4 for luminance and 2 for chrominance) have been coded and thus include corresponding DCT coefficients in the texture information 1614.

[0163] When RVLC is enabled, the texture information 1614 is further divided into a first portion and a second portion. The first portion immediately following the motion marker 1612 includes “cbpy” information, which identifies which of the 4 luminance 8-by-8 blocks are actually coded and which are not. The cbpy information also includes a DC coefficient for those intra-coded macroblocks in the packet for which the corresponding “Intra DC VLC encoding” has been enabled.

[0164] The cbpy information further includes an ac_pred_flag, which indicates whether the corresponding intra-coded macroblocks have been differentially encoded with AC prediction by the encoder from other macroblocks that are to the immediate left or are immediately above the macroblock. In one embodiment, the decoder uses all of or a selection of the cbpy information, the DC coefficient, and the ac_pred_flag in conjunction with the presence or absence of a first error-free portion of the DCT data in the texture information 1614 to assess which part can be safely decoded. In one example, the presence of such a good portion of data indicates that DC coefficients of intra macroblocks and cbpy-inferred non-coded Y-blocks of a macroblock can be decoded.

[0165] One technique used in digital communications to increase the robustness of transmitted or stored digital information is forward error correction (FEC) coding. FEC coding includes the addition of error correction information before data is stored or transmitted. Part of the FEC process can also include other techniques such as bit-interleaving. Both the original data and the error correction information are stored or transmitted, and when data is lost, the FEC decoder can reconstruct the missing data from the data that it received and the error correction information.

[0166] Advantageously, embodiments of the invention decode FEC codes in an efficient and backward compatible manner. One drawback to FEC coding techniques is that the error correction information increases the amount of data that is stored or transmitted, referred to as overhead. FIG. 17 illustrates one example of a tradeoff between bit error rate (BER) correction capability and overhead. A horizontal axis 1710 corresponds to an average BER correction capability. A vertical axis 1720 corresponds to an amount of overhead, expressed in FIG. 17 as a percentage. A first curve 1730 corresponds to a theoretical bit overhead versus BER correction capability. A second curve 1740 corresponds to one example of actual overhead versus BER correction capability. Despite the overhead costs, the benefits of receiving the original data as intended can outweigh the drawbacks of increased data storage or transmission, or the drawbacks of a revised bit allocation in a bandwidth-limited system.

[0167] Another disadvantage to FEC coding is that the data, as encoded with FEC codes, may no longer be compatible with systems and/or standards in use prior to FEC coding. Thus, FEC coding is relatively difficult to add to existing systems and/or standards, such as MPEG-4.

[0168] To be compatible with existing systems, a video bitstream should be compliant with a standard syntax, such as MPEG-4 syntax. To retain compatibility with existing systems, embodiments of the invention advantageously decode FEC coded bitstreams that are encoded only with systematic FEC codes and not non-systematic codes, and retrieve FEC codes from identified user data video packets.

[0169] FIG. 18 illustrates a video bitstream with systematic FEC data. FEC codes can correspond to either systematic codes or non-systematic codes. A systematic code leaves the original data untouched and appends the FEC codes separately. For example, a conventional bitstream can include a first data 1810, a second data 1830, and so forth. With systematic coding, the original data, i.e., the first data 1810 and the second data 1830, is preserved, and the FEC codes are provided separately. An example of the separate FEC code is illustrated by a first FEC code 1820 and a second FEC code 1840 in FIG. 18. In one embodiment, the data is carried in a VOP packet, and the FEC codes are carried in a user data packet, which follows the corresponding VOP packet in the bitstream. One embodiment of the encoder includes a packet of FEC codes in a user data video packet for each VOP packet. However, it will be understood that, depending on decisions made by the encoder, not every portion of the data may be supplemented with FEC codes.

[0170] By contrast, in a non-systematic code, the original data and the FEC codes are combined. It will be understood by one of ordinary skill in the art that FEC techniques that generate non-systematic codes result in bitstreams that are no longer compliant with the standard syntax, and such techniques should be avoided where the applicable video standard does not specify FEC coding.

[0171] A wide variety of FEC coding types can be used. In one embodiment, the FEC coding techniques correspond to Bose-Chaudhuri-Hocquenghem (BCH) coding techniques. In one embodiment, a block size of 511 is used. In the illustrated configurations, the FEC codes are applied at the packetizer level, as opposed to another level, such as a channel level.

[0172] In the context of an MPEG-4 system, one way of including the separate systematic error correction data, as shown by the first FEC code 1820 and the second FEC code 1840, is to include the error correction data in a user data video packet. The user data video packet can be ignored by a standard MPEG-4 decoder. In the MPEG-4 syntax, a data packet is identified as a user data video packet in the video bitstream by a user data start code, which is a bit string of 000001B2 in hexadecimal (start code value of B2), as the start code of the data packet. Various data can be included with the FEC codes in the user data video packet. In one embodiment, a user data header code identifies the type of data in the user data video packet. For example, a 16-bit code for the user data header code can identify that data in the user data video packet is FEC code. In another example, such as in a standard yet to be defined, the FEC codes of selected data are carried in a dedicated data packet with a unique start code.

[0173] It will be appreciated that error correction codes corresponding to all the data in the video bitstream can be included in the user data video packet. However, this disadvantageously results in a relatively large amount of overhead. One embodiment of the invention advantageously encodes FEC codes from only a selected portion of the data in the video bitstream. The user data header code in the user data video packet can further identify the selected data to which the corresponding FEC codes apply. In one example, FEC codes are provided and decoded only for data corresponding to at least one of motion vectors, DC coefficients, and header information.
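
By way of illustration only, retrieval of FEC codes from a user data video packet can be sketched as follows. The start code 0x000001B2 is the MPEG-4 user data start code noted above; the 16-bit header-code values and the function name are hypothetical.

    # Minimal sketch of locating FEC data in a user data video packet.
    USER_DATA_START_CODE = b"\x00\x00\x01\xB2"

    # Hypothetical 16-bit user data header codes identifying the covered subset.
    FEC_MOTION_VECTORS = 0x0001
    FEC_DC_COEFFICIENTS = 0x0002
    FEC_HEADER_INFO = 0x0004

    def extract_fec_payload(bitstream: bytes):
        """Return (coverage_code, fec_bytes) from the first user data video
        packet, or None when the bitstream carries no such packet."""
        start = bitstream.find(USER_DATA_START_CODE)
        if start < 0:
            return None
        body = bitstream[start + 4:]
        coverage = int.from_bytes(body[:2], "big")    # 16-bit user data header code
        end = body.find(b"\x00\x00\x01", 2)           # payload runs to the next start code prefix
        payload = body[2:end] if end >= 0 else body[2:]
        return coverage, payload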

[0174] FIG. 19 is a flowchart 1900 generally illustrating a process of decoding systematically encoded FEC data in a video bitstream. The process can be activated once per VOP. The decoding process is advantageously compatible with video bitstreams that include FEC coding and those that do not. The process starts at a first state 1904, where the process receives the video bitstream. The video bitstream can be received wirelessly, through a local or a remote network, and can further be temporarily stored in buffers and the like. The process advances from the first state 1904 to a second state 1908.

[0175] In the second state 1908, the process retrieves the data from the video bitstream. For example, in an MPEG-4 decoder, the process can identify those portions corresponding to standard MPEG-4 video data and those portions corresponding to FEC codes. In one embodiment, the process retrieves the FEC codes from a user data video packet. The process advances from the second state 1908 to a decision block 1912.

[0176] In the decision block 1912, the process determines whether FEC codes are available to be used with the other data retrieved in the second state 1908. When FEC codes are available, the process proceeds from the decision block 1912 to a third state 1916. Otherwise, the process proceeds from the decision block 1912 to a fourth state 1920. In another embodiment, the decision block 1912 instead determines whether an error is present in the received video bitstream. It will be understood that the corresponding portion of the video bitstream that is inspected for errors can be stored in a buffer. When an error is detected, the process proceeds from the decision block 1912 to the third state 1916. When no error is detected, the process proceeds from the decision block 1912 to the fourth state 1920.

[0177] In the third state 1916, the process decodes the FEC codes to reconstruct the faulty data and/or verify the correctness of the received data. The third state 1916 can include the decoding of the normal video data that is accompanied by the FEC codes. In one embodiment, only selected portions of the video data are supplemented with FEC codes, and the process reads header codes or the like, which indicate the data to which the retrieved FEC codes correspond.

[0178] The process advances from the third state to an optional fifth state 1924. One encoding process further includes other data in the same packet as the FEC codes. For example, this other data can correspond to at least one of a count of the number of motion vectors and a count of the number of bits per packet that are encoded between the resync field and the motion marker field. Such a count allows a decoder to advantageously resynchronize to a video bitstream earlier than at the next marker in the bitstream that permits resynchronization. The process advances from the optional fifth state 1924 to the end. The process can be reactivated to process the next batch of data, such as another VOP.

[0179] In the fourth state 1920, the process uses the retrieved video data. The retrieved data can be the normal video data corresponding to a video bitstream without embedded FEC codes. The retrieved data can also correspond to the normal video data that is maintained separately in the video bitstream from the embedded FEC codes. The process then ends until reactivated to process the next batch of data.
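
By way of illustration only, the per-VOP flow of FIG. 19 can be sketched as follows; the helper callables stand in for the decoder's own routines and are hypothetical.

    # Minimal sketch of the FIG. 19 decision flow for one VOP.
    def decode_vop(vop_data, fec_payload, fec_decode, plain_decode):
        """fec_payload is None when no FEC user data accompanies the VOP."""
        if fec_payload is not None:
            # Third state: verify and/or repair the covered data, then decode.
            repaired = fec_decode(vop_data, fec_payload)
            return plain_decode(repaired)
        # Fourth state: use the retrieved video data as-is.
        return plain_decode(vop_data)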

[0180] FIG. 20 is a block diagram generally illustrating one process of using a ring buffer in error resilient decoding of video data. Data can be transmitted and/or received at varying bit rates and in bursts. For example, network congestion can cause delays in the receipt of packets of data. The dropping of data, particularly in wireless environments, can also occur. In addition, a relatively small amount of received data can be stored in a buffer until it is ready to be processed by a decoder.

[0181] One embodiment of the invention advantageously uses a ring buffer to store incoming video bitstreams for error resilient decoding. A ring buffer is a buffer with a fixed size. It will be understood that the size of the ring buffer can be selected in a very broad range. A ring buffer can be constructed from an addressable memory, such as a random access memory (RAM). Another name for a ring buffer is a circular buffer.

[0182] The storing of the video bitstream in the ring buffer is advantageous in error resilient decoding, including error resilient decoding of video bitstreams in a wireless MPEG-4 compliant receiver, such as a video-enabled cellular telephone. With error resilient decoding techniques, data from the video bitstream may be read from the video bitstream multiple times, in multiple locations, and in multiple directions. The ring buffer permits the decoder to retrieve data from various portions of the video bitstream in a reliable and efficient manner. In one test, use of the ring buffer sped access to bitstream data by a factor of two.

[0183] In contrast to other buffer implementations, data is advantageously not flushed from a ring buffer. Data enters and exits the ring buffer in a first-in first-out (FIFO) manner. When a ring buffer is full, the addition of an additional element overwrites the first element or the oldest element in the ring buffer.
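
By way of illustration only, a fixed-size ring buffer with the FIFO overwrite behavior described above can be sketched as follows; the class name and sizing are hypothetical.

    # Minimal sketch of a byte-oriented ring buffer.
    class RingBuffer:
        def __init__(self, size):
            self.data = bytearray(size)
            self.size = size
            self.write_pos = 0
            self.count = 0            # number of valid bytes, at most `size`

        def push(self, chunk):
            """Append bytes; the oldest bytes are overwritten when full."""
            for b in chunk:
                self.data[self.write_pos] = b
                self.write_pos = (self.write_pos + 1) % self.size
                self.count = min(self.count + 1, self.size)

        def read(self, pos, length):
            """Read `length` bytes starting at buffer position `pos`,
            wrapping around the end of the buffer."""
            return bytes(self.data[(pos + i) % self.size] for i in range(length))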

[0184] The block diagram of FIG. 20 illustrates one configuration of a ring buffer 2002. Data received from the video bitstream is loaded into the ring buffer 2002 as the data is received. In one embodiment, the modules of the decoder that decode the video bitstream do not access the video bitstream directly, but rather, access the video bitstream data that is stored in the ring buffer 2002. Also, the skilled practitioner will appreciate that the ring buffer 2002 can reside either ahead of or behind a VOP decoder in the data flow. However, the placement of the ring buffer 2002 ahead of the VOP decoder saves memory for the ring buffer 2002, as the VOP is in compressed form ahead of the VOP decoder.

[0185] The video bitstream data that is loaded into the ring buffer 2002 is represented in FIG. 20 by a bitstream file 2004. Data logging information, including error logging information, such as error flags, is also stored in the ring buffer 2002 as it is generated. The data logging information is represented in FIG. 20 as a log file 2006. In one embodiment, a log interface between H.223 output and decoder input advantageously synchronizes or aligns the data logging information in the ring buffer 2002 with the video bitstream data.

[0186] A first arrow 2010 corresponds to a location (address) in the ring buffer 2002 in which data is stored. As data is added to the ring buffer 2002, the ring buffer 2002 conceptually rotates in the clockwise direction as shown in FIG. 20. A second arrow 2012 indicates an illustrative position from which data is retrieved from the ring buffer 2002. A third arrow 2014 can correspond to an illustrative byte position in the packet that is being retrieved or accessed. Packet start codes 2016 can be dispersed throughout the ring buffer 2002.

[0187] When data is retrieved from the ring buffer 2002 for decoding of a VOP with video packets enabled, one embodiment of the decoder inspects the corresponding error flag of each packet. When a packet is found to be corrupted, the decoder skips packets until the decoder encounters a clean or error-free packet. When the decoder encounters a packet, it stores the appropriate location information in an index table, which allows the decoder to access the packet efficiently without repeating a seek for the packet. In another embodiment, the decoder uses the contents of the ring buffer 2002 to recover and use data from partially corrupted video packets as described earlier in connection with FIGS. 7-16.

[0188] Table II illustrates a sample of contents of an index table, which allows relatively efficient access to packets stored in the ring buffer 2002.

TABLE II
Index-Table Entry   Initial Value   Description
Valid               0               Valid flag. A value of 1 indicates that valid data corresponding to this entry exists in the ring buffer.
Past                0               Past flag. A value of 0 indicates that this index has a current or future index.
Pos                 0               Start position of the packet, which indicates a position in the ring buffer.
ErrorType           0               Error type.
Size                0               Packet size.
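
By way of illustration only, one entry of the index table of Table II can be sketched as follows, initialized to the listed default values; the field and class names are hypothetical.

    # Minimal sketch of an index-table entry for packets in the ring buffer.
    from dataclasses import dataclass

    @dataclass
    class IndexTableEntry:
        valid: int = 0        # 1 when valid data for this entry exists in the ring buffer
        past: int = 0         # 0 when the index refers to a current or future packet
        pos: int = 0          # start position of the packet in the ring buffer
        error_type: int = 0   # error type recorded for the packet
        size: int = 0         # packet size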

[0189] Various embodiments of the invention have been described above. Although this invention has been described with reference to these specific embodiments, the descriptions are intended to be illustrative of the invention and are not intended to be limiting. Various modifications and applications may occur to those skilled in the art without departing from the true spirit and scope of the invention as defined in the appended claims.

What is claimed is:
1. A video decoder adapted to reconstruct corrupted video data comprising: a receiver circuit adapted to receive a video bitstream; a buffer coupled to the receiver circuit, where the buffer is adapted to store at least a portion of the video bitstream; a parsing circuit adapted to distinguish video data from forward error correction (FEC) codes; an error monitoring circuit configured to detect corruption in the video data; and an FEC decoder adapted to receive the video data and the FEC codes, where the FEC decoder is configured to remove the corruption in the video data to which the FEC codes apply.
2. The video decoder as defined in claim 1, wherein the FEC decoder decodes FEC codes that correspond to Bose-Chaudhuri-Hocquenghem (BCH) codes.
3. The video decoder as defined in claim 1, wherein the buffer is a ring buffer.
4. The video decoder as defined in claim 1, wherein the parsing circuit is configured to retrieve the video data from a packet for a video object plane (VOP) and to retrieve the FEC codes from a user data video packet associated with the VOP.
5. A video decoder that decodes a video bitstream that includes forward error correction (FEC) codes, the video decoder comprising: means for receiving the video bitstream, which includes both video data and FEC codes; means for retrieving video data from the video bitstream; means for determining if there is corruption in a portion of the video data retrieved; means for retrieving FEC codes from the video bitstream in response to a detection of corruption; and means for using the FEC codes to reconstruct the portion of the video data such that the portion of the video data is recovered without corruption.
6. A process of decoding a video bitstream that includes forward error correction (FEC) codes, the process comprising: receiving the video bitstream, which includes both video data and FEC codes; retrieving video data from the video bitstream; determining if there is corruption in a portion of the video data retrieved; retrieving FEC codes from the video bitstream in response to a detection of corruption; and using the FEC codes to reconstruct the portion of the video data such that the portion of the video data is recovered without corruption.
7. The process as defined in claim 6, wherein the FEC codes correspond to Bose-Chaudhuri-Hocquenghem (BCH) codes.
8. The process as defined in claim 6, further comprising: storing the video bitstream in a buffer; retrieving the video data from the buffer when retrieving video data from the video bitstream; and retrieving the FEC codes from the buffer when retrieving the FEC codes from the video bitstream.
9. The process as defined in claim 8, wherein the buffer is a ring buffer.
10. The process as defined in claim 6, further comprising retrieving the video data from a packet for a video object plane (VOP) and retrieving the FEC codes from a user data video packet associated with the VOP.
11. The process as defined in claim 6, further comprising receiving a header code that specifies a subset of video data to which the FEC codes correspond, and applying the FEC codes only to the subset of video data.
12. The process as defined in claim 6, further comprising concealing an error in a corresponding pixel with a gray color pixel when the portion of the video data cannot be recovered without corruption.
13. A process of decoding a video bitstream that includes forward error correction (FEC) codes, the process comprising: receiving the video bitstream, which includes both video data and FEC codes; retrieving video data from the video bitstream; determining if FEC codes that correspond to the retrieved video data are available; retrieving FEC codes from the video bitstream when the FEC codes are available; and using the FEC codes to decode the portion of the video data such that the portion of the video data is recovered without corruption.
14. The process as defined in claim 13, wherein the FEC codes correspond to Bose-Chaudhuri-Hocquenghem (BCH) codes.
15. The process as defined in claim 13, further comprising: storing the video bitstream in a buffer; retrieving the video data from the buffer when retrieving video data from the video bitstream; and retrieving the FEC codes from the buffer when retrieving the FEC codes from the video bitstream.
16. The process as defined in claim 15, wherein the buffer is a ring buffer.
17. The process as defined in claim 13, further comprising retrieving the video data from a packet for a video object plane (VOP) and retrieving the FEC codes from a user data video packet associated with the VOP.
18. The process as defined in claim 13, further comprising receiving a header code that specifies a subset of video data to which the FEC codes correspond, and applying the FEC codes only to the subset of video data.